Feature #14362
closed
use BigDecimal instead of Float by default
Description
When writing a decimal the default type assigned is Float:
> 1.2.class
=> Float
This is great for memory savings and for application speed but it comes with accuracy issues:
> 129.95 * 100
=> 12994.999999999998
Ruby's own BigDecimal docs say:
Decimal arithmetic is also useful for general calculation, because it provides the correct answers people expect–whereas normal binary floating point arithmetic often introduces subtle errors because of the conversion between base 10 and base 2.
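A minimal sketch of the same calculation in both types, for comparison. It assumes the price arrives as a String, since BigDecimal needs a String (or an explicit precision) to avoid inheriting the Float's binary approximation.

```ruby
require 'bigdecimal'

float_total   = 129.95 * 100                # binary floating point
decimal_total = BigDecimal("129.95") * 100  # decimal arithmetic

puts float_total                 # 12994.999999999998, as in the example above
puts decimal_total.to_s("F")     # 12995.0
puts(decimal_total == 12995)     # true
```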
What if BigDecimal was moved into the Ruby core and made the default for numbers like 1.2?
> 1.2.class
=> BigDecimal
I realize this goes against the 3x3 goal but I think BigDecimal is preferable over Float for developer happiness. I've seen lots of developers stumble when first learning about the pitfalls of Float. I've seen test suites where a range is tested for because of answers like 12994.999999999998 instead of 12995.0. At one point trading accuracy for performance made sense. I'm not sure that's still the case today.
Right now a decimal generates the faster and less accurate Float. Developers have to opt in to the slower but safer BigDecimal by manually requesting a BigDecimal. By flipping this we default to the safer version and ask developers to opt in to the faster but less accurate Float if needed.
> 1.2.class
=> Decimal
> Float.new('1.2')
=> 1.2
There could also be a shorthand for float where the number is followed by an f (similar to the r suffix for Rational).
1.2f # => Float
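For reference, a short sketch of the explicit conversions that exist today, as opposed to the proposed syntax above. The variable names are only illustrative.

```ruby
require 'bigdecimal'
require 'bigdecimal/util'  # adds String#to_d

as_float     = Float("1.2")       # Kernel#Float; Float itself has no .new
as_decimal   = BigDecimal("1.2")  # explicit opt-in to BigDecimal
also_decimal = "1.2".to_d         # same value via bigdecimal/util
as_rational  = 1.2r               # existing literal suffix for Rational

p as_float.class     # Float
p as_decimal.class   # BigDecimal
p as_rational        # (6/5)
```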
The change would help "provide the correct answers people expect". The change would be mostly seamless from an interface standpoint. The only methods on Float and not on BigDecimal appear to be rationalize, next_float, and prev_float. I suspect those methods are rarely used. The increased accuracy seems unlikely to cause code issues for people.
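One way to check the interface difference on a given Ruby is to subtract the two method lists. This is only a sketch; the exact result depends on the Ruby and bigdecimal versions installed, so no output is shown.

```ruby
require 'bigdecimal'

# Public instance methods that Float responds to but BigDecimal does not.
float_only = Float.instance_methods - BigDecimal.instance_methods

puts float_only.sort
```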
The two largest downsides that I can come up with are speed and display. I'm not sure what kind of hit is taken by handling all decimals as BigDecimal. Would an average Rails application see a large hit? Additionally, the display value of BigDecimal is engineering notation. This is also the default produced by to_s. It's harder to read and might mess up code by displaying things like "0.125e2" instead of "12.5". Certainly the default produced by to_s could change to the conventional floating point notation.
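A small sketch of the two notations. The "F" format argument to BigDecimal#to_s already produces the conventional form, so the change discussed here would only concern the default.

```ruby
require 'bigdecimal'

value = BigDecimal("12.5")

default_text = value.to_s       # "0.125e2"
plain_text   = value.to_s("F")  # "12.5"

puts default_text
puts plain_text
```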
A change this significant would likely target Ruby 3 so there would be time to make some changes like adding a BigDecimal#rationalize method or changing the default output of BigDecimal#to_s.
Thank you for considering this.
Updated by shevegen (Robert A. Heiler) almost 7 years ago
I realize this goes against the 3x3 goal but I think BigDecimal is preferable over Float for developer happiness.
That's an interesting comment. :)
You could ask matz whether 3x3 is more important than making life
of developers easier. I am not saying that your proposal makes
life easier, mind you; I think that matz probably may prefer
making life easier/better for people using ruby. And the speed
improvements such as 3x3 may come with that secondary focus.
(I have not been following all changes in this respect, but I
think that if mjit comes, then 3x3 will be achieved mostly already
since ruby 2.5.x is already quite a lot faster than 2.0 was).
There are some secondary considerations though. Note that I have
no real pro or con opinion in regards to your proposal, but to
me, "Float" is easier to remember and think about, than
"BigDecimal".
Granted, I do not use it directly (there is no Float.new either),
I only use the numbers directly. But in my mind, I think of 3.5 as
float and never as a "big decimal"; neither when there is a large
float. My mind always thinks of it as ... a float. (Actually, the
name big decimal is also more limited than float, semantic-wise.
It would insinuate a big float number right? Not a small float
necessarily... actually I don't even see BigDecimal ... is it
used anywhere?)
Anyway, I don't want to discourage you in the slightest. I guess
you have to see what matz says on it.
A change this significant would likely target Ruby 3
Agreed. Matz wrote somewhat that backwards-incompatible changes
should go into 3.x preferentially.
For the speed penalty, if there is one, I think it would be nice
if someone could add a table to show the differences (if there
are any).
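A rough sketch of how such a comparison could be gathered with the standard Benchmark module. The workload and iteration count are arbitrary assumptions, and the timings depend entirely on the machine and Ruby version, so none are quoted here.

```ruby
require 'benchmark'
require 'bigdecimal'

iterations    = 100_000  # arbitrary workload size
float_price   = 129.95
decimal_price = BigDecimal("129.95")

# Time the same multiply-and-add with each numeric type.
float_seconds = Benchmark.realtime do
  iterations.times { float_price * 100 + float_price }
end

decimal_seconds = Benchmark.realtime do
  iterations.times { decimal_price * 100 + decimal_price }
end

printf("Float:      %.4f s\n", float_seconds)
printf("BigDecimal: %.4f s\n", decimal_seconds)
```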
Updated by sos4nt (Stefan Schüßler) almost 7 years ago
"arbitrary-precision" doesn't mean that BigDecimal is immune to rounding problems:
a = BigDecimal(1)
b = BigDecimal(3)
(a / b) * b
#=> 0.999999999999999999e0
Updated by AaronLasseigne (Aaron Lasseigne) almost 7 years ago
That's absolutely true. However, it's much less likely and I would say less surprising than the issues you find with a Float. Switching to use Rational where possible is an option but felt like a step too far. In short, it's not perfect but I still think BigDecimal is much friendlier and less prone to user error than Float.
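For comparison, a sketch of the 1/3 round trip in both types. Rational stores the fraction itself, so the round trip is exact, while BigDecimal has to cut the quotient off at some number of digits.

```ruby
require 'bigdecimal'

third_decimal  = BigDecimal(1) / BigDecimal(3)  # truncated decimal expansion
third_rational = Rational(1, 3)                 # exact fraction

puts(third_decimal * 3 == 1)   # false
puts(third_rational * 3 == 1)  # true
```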
Updated by chrisseaton (Chris Seaton) almost 7 years ago
In TruffleRuby we represent values that have a single logical class using multiple implementation techniques, transparently to the user. For example, a Hash can be either a linear array of values or an array of buckets, and a Fixnum can be either an int32 or an int64.
It might be possible to represent a decimal logically as a BigDecimal but actually use more efficient implementation representations when possible. For example 1.0 could be represented as a float64, even though we still tell the user that it is a BigDecimal. An actual BigDecimal would start to be used when computations are performed that aren't representable as a float64, but not if the value is just converted to a String or something like that.
This would need some proper research to figure out if it's workable and useful.
This kind of mechanism has an overhead, but thankfully it's the kind of thing that is fixed by the JITs that Ruby is developing - the switch between the two representations becomes an inline cache.
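As a very rough illustration of the decision such a scheme would have to make, the sketch below checks whether a decimal literal survives conversion to a float64 unchanged. The helper name is hypothetical and this is not how TruffleRuby works; it only compares the exact value of the literal with the exact value of the resulting Float.

```ruby
# Hypothetical helper: true when the decimal literal is exactly
# representable as a float64, so a Float could stand in for it.
def exactly_representable_as_float?(literal)
  Float(literal).to_r == Rational(literal)
end

puts exactly_representable_as_float?("1.0")     # true
puts exactly_representable_as_float?("0.5")     # true
puts exactly_representable_as_float?("1.2")     # false
puts exactly_representable_as_float?("129.95")  # false
```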
Updated by nobu (Nobuyoshi Nakada) almost 7 years ago
You can write "exact" number by 1.2r.
Updated by sos4nt (Stefan Schüßler) almost 7 years ago
nobu (Nobuyoshi Nakada) wrote:
You can write "exact" number by 1.2r.
Rational numbers work fine for +, -, * and / but once you encounter an irrational number, you'll have numerical errors again: ¯\_(ツ)_/¯
(2.0r ** 0.5r) ** 2.0r
#=> 2.0000000000000004
Updated by nobu (Nobuyoshi Nakada) almost 7 years ago
Rational and BigDecimal do not cover irrational numbers.
require 'bigdecimal'
p (BigDecimal("2.0")**BigDecimal("0.5"))**BigDecimal("2.0")
#=> 0.19999999932878736e1
You'd need a mathematical solver, not a mere numeric class.
Updated by sos4nt (Stefan Schüßler) almost 7 years ago
nobu (Nobuyoshi Nakada) wrote:
Rational and BigDecimal do not cover irrational numbers.
That's exactly what I wanted to say. Changing Float to BigDecimal only solves some problems. It's not a magic bullet. (neither is Rational)
BigDecimal's documentation blames Float for introducing subtle errors, but it has its own issues, even for numbers with a finite decimal representation:
n = 2 ** 128
#=> 340282366920938463463374607431768211456
(BigDecimal(1) / n) * n
#=> #<BigDecimal:7fcfab9a5f10,'0.9999999999 9999999999 9999999999 9999999999 9999999999 9999999999 9999999999 9999999999 9999999914 9294082697 6538413415 6348142057 947136E0',126(153)>
That's neither "very accurate" nor does it qualify as "correct answers people expect".
In order to get the correct result, I have to resort to BigDecimal#div and provide the number of significant digits manually:
BigDecimal(1).div(n, 91) * n
#=> #<BigDecimal:7fcfaba0d9d0,'0.1E1',9(162)>
Updated by AaronLasseigne (Aaron Lasseigne) over 6 years ago
I don't think anyone is arguing that this fixes everything or is "a magic bullet". I think most developers are familiar with the inaccurate nature of division on computers. Most would expect that "1/3" will be something like "0.33". However, with Float you end up with errors that are less predictable (like the 129.95 * 100 example I gave above).
My suggestion isn't that we can fix math by using BigDecimal. It's that BigDecimal is more developer friendly than Float and less likely to surprise you. It's also a step that can be taken without causing the major upheaval of a move to something like Rational.
Updated by duerst (Martin Dürst) over 6 years ago
I think it would be good if some of the proponents of this feature would do a careful speed analysis. My personal guess is that it would get considerably, and unpredictably, slower in many cases. After all, floating point numbers are supported on hardware and limited in size. BigDecimal isn't supported in hardware and isn't limited in size.
I'm not so much afraid about the average slowdown of the "average" Rails application. I'm more concerned about the unintended slowdown of (Rails and other) applications that do significant amounts of calculation, or the occasional and very difficult to diagnose slowdown of applications when they hit specific values. On top of that, I'm also concerned about a possibility of DOS attacks using specific input values that lead to a slowdown.
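A small sketch of why the cost can depend on the values involved. It assumes no BigDecimal.limit is set, in which case multiplication keeps every digit of the product, so repeated squaring makes the stored number grow while a Float stays a fixed 64 bits. The starting value and iteration count are arbitrary.

```ruby
require 'bigdecimal'

value = BigDecimal("1.1")
digit_counts = []

# Square the value repeatedly and record how long its plain
# decimal representation becomes after each step.
10.times do
  value *= value
  digit_counts << value.to_s("F").length
end

p digit_counts
```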
Updated by yugui (Yuki Sonoda) over 6 years ago
I think most developers are familiar with the inaccurate nature of division on computers. ... However, with Float you end up with errors that are less predictable
These statements sounded weird to me. Why do you think they are not familiar with floating-point values like IEEE754?
If they are not familiar with floating-point values, why do you think they are familiar with the inaccuracy?
If it is a matter of degree, why do you think it is more predictable than 129.95*100 != 129.95e2 that 1/n*n == 1 depends on the size of n? I guess they are still not used to 1/n*n != 1 when 1/n is mathematically a finite decimal unlike 1/3, though.
In my opinion, consistency is more important for predictability and less surprise. And IEEE754, or some common implementation of floating-point values on the platform, gives the minimum consistency on which developers rely.
In other words, floating-point values are consistently inaccurate in a well-defined and well-known manner, which makes things predictable.
Updated by mrkn (Kenta Murata) over 6 years ago
As a maintainer of BigDecimal, I don't agree with you about BigDecimal is more developer friendly than Float.
And the current BigDecimal is not better than Rational for representing rational numbers because it has problems in its precision handling by the historical reason, which I'm working to fix.
I recommend using BigDecimal only for the case that needs to represent decimal numbers with the finite number of digits exactly.
Updated by AaronLasseigne (Aaron Lasseigne) over 6 years ago
yugui,
Why do you think they are not familiar with floating-point values like IEEE754?
In my experience, most developers are not intimately familiar with the details of floating point implementations. I have witnessed a number of developers who were surprised by the result of what they thought was a straightforward calculation (like my example).
If it is a matter of degree, why do you think it is more predictable than 129.95*100 != 129.95e2 that 1/n*n == 1 depends on the size of n?
I do think it's a matter of degree. I understand that BigDecimal isn't perfect but I think it's worth discussing whether it's better than Float or not. I feel that it is easy to remember that each calculation results in a single stored number and that infinitely long answers (e.g. 1/3) can't be properly stored. Remembering which numbers can and cannot be represented by IEEE754 is much harder. To me, the limitations of BigDecimal seem easier to reason about than the quirks of Float.
In my opinion, consistency is more important for predictability and less surprise.
I'll admit that my knowledge of BigDecimal is limited. Are there consistency issues across platforms? I'll agree that inconsistent results like that might be a big negative to switching.
mrkn,
And the current BigDecimal is not better than Rational for representing rational numbers because it has problems in its precision handling by the historical reason, which I'm working to fix.
I recommend using BigDecimal only for the case that needs to represent decimal numbers with the finite number of digits exactly.
This ticket is not suggesting a switch to Rational or claiming that BigDecimal is superior to Rational. I certainly acknowledge that BigDecimal has limitations. The question is only whether it's an improvement over Float.
As a maintainer of BigDecimal, I don't agree with you about BigDecimal is more developer friendly than Float.
Can you explain why? Are there limitations that make it a poor replacement?
Updated by matz (Yukihiro Matsumoto) over 6 years ago
- Status changed from Open to Rejected
Rejected. Unfortunately, the incompatibility this proposal would bring is too big.
Besides that, we have performance concern too.
Matz.