$\begingroup$

Let's assume I have a column with float values (e.g., 3.12334354454, 5.75434331354, and so on), and I round these values to two decimal places (e.g., 3.12, 5.75).

I think the advantages and disadvantages of doing this would be:

Advantages:

  • Less memory required: Rounded values take up less space.

  • Smoother and faster calculations: Simplified values can speed up processing.

  • Better segregation: Rounded values might help in categorizing data more effectively. (Not sure).

Disadvantages:

  • Loss of information: Precision is reduced, which can be critical in some use cases.

Please provide your thoughts.

$\endgroup$

4 Answers

$\begingroup$

On top of the advantages you mentioned, rounding can also prevent overfitting. For example, if you have a value of 1.234567 with target 0 and a value of 1.234568 with target 1, a tree model might try to learn a boundary at the midpoint 1.2345675. If you round, both values become 1.23, and the model will learn 0.5 for 1.23, which might be more realistic.
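As a toy illustration of this (a hypothetical two-sample dataset, not from the question):

```python
# Two samples that differ only in the 6th decimal place, with opposite targets.
# A tree could split at the midpoint 1.2345675 and memorise both samples.
x1, x2 = 1.234567, 1.234568
y1, y2 = 0, 1

# After rounding to two decimals the samples become indistinguishable,
# so the best a model can do is predict the average target (0.5).
r1, r2 = round(x1, 2), round(x2, 2)
print(r1, r2)    # 1.23 1.23
print(r1 == r2)  # True
```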

An easy way to see this is to think of the rounding operation as adding some triangular noise (not far from Gaussian noise).

As a side note, be aware that in some languages the rounding operation does not change the data type. In Python, for instance, rounding will not reduce memory use; you have to change the data type to do that, e.g. float32 to float16. Even then, the actual memory gain depends on how your ML pipeline handles data types.
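For instance, with NumPy (a sketch; the array contents are arbitrary):

```python
import numpy as np

a = np.random.rand(1000).astype(np.float32)

b = np.round(a, 2)        # rounded, but still float32: no memory saved
c = a.astype(np.float16)  # smaller dtype: memory actually halved

print(a.nbytes, b.nbytes, c.nbytes)  # 4000 4000 2000
```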

$\endgroup$
  • $\begingroup$ The disadvantage "loses information" is exactly the same as the advantage of "reduces overfitting", <something something bias-variance tradeoff>. $\endgroup$
    – Ben Reiniger
$\begingroup$

The numeric precision is only reduced if you round to fewer digits than your input data really has. (It is worth thinking about whether the given decimals are realistic.) Otherwise, the many digits are just mathematical artifacts.

If you really lose significant information by dropping the least significant decimal, then you should probably reconsider your scaling. That does not sound like a robust model.

$\endgroup$
$\begingroup$

Less memory required: Rounded values take up less space.

No. You reduced precision, but the numbers are still floats, occupying the same number of bits in memory. Converting float to integer would indeed reduce memory, e.g. 3.12 -> 3.
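A quick NumPy sketch of the difference, using the values from the question:

```python
import numpy as np

x = np.array([3.12334354454, 5.75434331354])

r = np.round(x, 2)               # still float64: 8 bytes per value
i = np.round(x).astype(np.int8)  # rounded to integer: 1 byte per value

print(r, r.itemsize)  # [3.12 5.75] 8
print(i, i.itemsize)  # [3 6] 1
```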

Smoother and faster calculations: Simplified values can speed up processing.

Not if you still use floats. True if you use integers, which are also better suited to hardware accelerators.

Better segregation: Rounded values might help in categorizing data more effectively.

As suggested by others, adding a bit of noise to your training data could help with generalisation and prevent overfitting.

Loss of information: Precision is reduced, which can be critical in some use cases.

True, and it could be critical, especially if you perform true rounding (to integer types).

Note: using int types (a) is a valid technique, especially for inference on embedded platforms (e.g. a standalone mobile phone, no cloud), even when the initial training (b) of the model was done with floats in a much more powerful environment (data centres, multiprocessing, network, cloud).

(a) even int types with a very reduced number of bits, e.g. 8-bit int, or even less.

(b) might need some post-training adaptation.
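A minimal sketch of what such integer quantization might look like: an affine scale/zero-point scheme mapping floats onto 8-bit integers. The function names are illustrative; a real pipeline would use a framework's quantization tools and (b)'s post-training calibration.

```python
import numpy as np

def quantize(x, n_bits=8):
    """Map floats onto unsigned n-bit integers (assumes x is not constant)."""
    qmin, qmax = 0, 2 ** n_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = qmin - round(x.min() / scale)
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats from the quantized values."""
    return scale * (q.astype(np.float32) - zero_point)

w = np.array([-1.5, 0.0, 0.7, 2.1], dtype=np.float32)
q, s, z = quantize(w)
w_hat = dequantize(q, s, z)
# w_hat approximates w to within one quantization step (the scale)
```

Each float now fits in one byte instead of four, at the cost of a bounded reconstruction error.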

$\endgroup$
$\begingroup$

As mentioned in other answers, rounding a 32-bit float to another 32-bit float will not save any memory. All 32-bit floats use 32 bits of storage.

Another thing to consider is that decimal rounding can be problematic with floating-point values, because there is no exact representation of (for example) 0.1 in floating point. A floating-point number is binary: the fractions that can be represented exactly are 1/2, 1/4, 1/8, 1/16, etc., and sums of those values. The decimal value 0.1 is a repeating fraction in this scheme. So if you round 0.125 to 0.1, you have taken an exact value that is easily represented as a float and turned it into a non-exact value that uses all the places available in the float. In other words, decimal rounding may increase the number of places used in a floating-point value.
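This is easy to see with Python's `decimal` module, which can display the exact value a float actually stores:

```python
from decimal import Decimal

# 0.1 has no exact binary representation; the stored value is slightly off:
print(Decimal(0.1))
# 0.1000000000000000055511151231257827021181583404541015625

# 0.125 = 1/8 is exactly representable:
print(Decimal(0.125))  # 0.125

# Hence the classic surprise:
print(0.1 + 0.2 == 0.3)  # False
```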

If you store your data as CSV or some other textual value, rounding will reduce the space required for that format. This may be useful, especially if you end up with false precision when converting from floats to decimals.

And lastly, while I don't think it's a common situation, you should be aware of the 'butterfly effect', a term coined by Edward Lorenz after he found that rounding the values fed into one of his models caused it to produce wildly different results:

This was enough to tell me what had happened: the numbers that I had typed in were not the exact original numbers, but were the rounded-off values that had appeared in the original printout. The initial round-off errors were the culprits; they were steadily amplifying until they dominated the solution.

$\endgroup$
