Feature #1408

0.1.to_r not equal to (1/10)

Added by phasis68 (Heesob Park) about 3 years ago. Updated 9 months ago.

[ruby-core:23318]
Status: Closed
Start date: 04/26/2009
Priority: Normal
Due date:
Assignee: matz (Yukihiro Matsumoto)
% Done: 0%
Category: core
Target version: 2.0.0

Description

$ ruby -e 'p 0.1.to_r'
(3602879701896397/36028797018963968)

whereas 

$ ruby -e 'p "0.1".to_r'
(1/10)

Related issues

duplicated by ruby-trunk - Bug #5309: 0.6.to_r != "0.6".to_r Rejected 09/13/2011

History

Updated by phasis68 (Heesob Park) about 3 years ago

2009/4/27 Martin DeMello <martindemello@gmail.com>:
> On Sun, Apr 26, 2009 at 2:51 PM, Heesob Park <redmine@ruby-lang.org> wrote:
>>
>> $ ruby -e 'p 0.1.to_r'
>> (3602879701896397/36028797018963968)
>>
>> whereas
>>
>> $ ruby -e 'p "0.1".to_r'
>> (1/10)
>
> What, in theory, could be done about this? By the time to_r is
> invoked, 0.1 is already a binary float, with the implicit rounding
> off.
>
In theory, Float#to_r  could be done through Float#to_s#to_r.

Regards,

Park Heesob

Updated by shyouhei (Shyouhei Urabe) about 3 years ago

Heesob Park wrote:
> 2009/4/27 Martin DeMello <martindemello@gmail.com>:
>> On Sun, Apr 26, 2009 at 2:51 PM, Heesob Park <redmine@ruby-lang.org> wrote:
>>> $ ruby -e 'p 0.1.to_r'
>>> (3602879701896397/36028797018963968)
>>>
>>> whereas
>>>
>>> $ ruby -e 'p "0.1".to_r'
>>> (1/10)
>> What, in theory, could be done about this? By the time to_r is
>> invoked, 0.1 is already a binary float, with the implicit rounding
>> off.
>>
> In theory, Float#to_r  could be done through Float#to_s#to_r.


-1. That loses data.
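
To make the objection concrete, here is a small Ruby sketch comparing the exact conversion with the route through a string. It assumes IEEE 754 doubles and a Ruby whose Float#to_s prints the shortest decimal that round trips (true of current releases, not necessarily of the 1.9.1 build discussed here).

```ruby
exact    = 0.1.to_r        # exact value of the stored double
via_text = 0.1.to_s.to_r   # goes through the decimal string "0.1"

p exact      # (3602879701896397/36028797018963968)
p via_text   # (1/10)

# The two rationals differ by a tiny but non-zero amount, so the string
# route does not preserve the value that the Float actually holds.
p exact - via_text   # (1/180143985094819840)
p exact == 0.1       # true: the exact rational compares equal to the Float
```

The denominator of the exact result is 2**55, which is why no short decimal fraction can match it.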

Updated by rogerdpack (Roger Pack) about 3 years ago

> -1 that loses data.

True--however the (current) code for Float#to_s attempts to determine whether the floating point number "is the equivalent default for the rounded value" (i.e. if it round trips).
Do you think that using a comparison like this (similar to what Park suggested) would be good enough for deducing the true original value? (I've thought of proposing a similar thing for BigDecimal.)

ex: BigDecimal(0.1) => #<BigDecimal:2d8cec8,'0.1E0',4(8)>


-=r

Updated by tadf (tadayoshi funaba) about 3 years ago

to_r should provide exact conversion.
I think Ruby could also provide "rationalize", as Common Lisp and Scheme do,
but it does not yet.
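
For readers on a Ruby that does ship Float#rationalize (it was added later, as noted further down the thread), the distinction drawn here can be sketched as follows. The printed values assume IEEE 754 doubles.

```ruby
# Exact conversion: the rational equal to the stored double.
p 0.1.to_r           # (3602879701896397/36028797018963968)

# rationalize: the simplest rational that still maps to the same double.
p 0.1.rationalize    # (1/10)
p 0.3.rationalize    # (3/10)

# With an explicit tolerance the result can be much coarser.
p 1.333.rationalize(Rational(1, 100))   # (4/3)
```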

Updated by nobu (Nobuyoshi Nakada) about 3 years ago

Hi,

At Fri, 1 May 2009 21:12:52 +0900,
Roger Pack wrote in [ruby-core:23345]:
> True--however the (current) code for Float#to_s attempts to
> determine whether the floating point number "is the
> equivalent default for the rounded value" (i.e. if it round
> trips).

What about this?


Index: rational.c
===================================================================
--- rational.c	(revision 23433)
+++ rational.c	(working copy)
@@ -1286,4 +1286,5 @@ integer_to_r(VALUE self)
 }

+#if 0
 static void
 float_decode_internal(VALUE self, VALUE *rf, VALUE *rn)
@@ -1299,5 +1300,4 @@ float_decode_internal(VALUE self, VALUE 
 }

-#if 0
 static VALUE
 float_decode(VALUE self)
@@ -1310,11 +1310,82 @@ float_decode(VALUE self)
 #endif

+#if FLT_RADIX == 2 && SIZEOF_BDIGITS * 2 * CHAR_BIT > DBL_MANT_DIG
+# ifdef HAVE_LONG_LONG
+#   define BDIGITDBL2NUM(x) ULL2NUM(x)
+# else
+#   define BDIGITDBL2NUM(x) ULONG2NUM(x) 
+# endif
+#else
+# define NEEDS_FDIV
+static ID id_fdiv;
+fun2(fdiv)
+#endif
+
+static VALUE
+float_r_round(double a, double f, int n)
+{
+    int i, r;
+#ifdef BDIGITDBL2NUM
+    BDIGIT_DBL fn = (BDIGIT_DBL)fabs(f);
+    BDIGIT_DBL d1 = (BDIGIT_DBL)1 << -n, d2 = d1;
+    BDIGIT_DBL rv = d1 % fn;
+    VALUE b, d;
+    if (rv < 10) {
+	for (i = 1, r = (int)rv; i <= r; ++i) {
+	    if ((double)fn / --d2 != a) break;
+	    if (fn % (d1 = d2) == 0) break;
+	}
+    }
+    else if ((rv = fn - rv) && rv < 10) {
+	for (i = 1, r = (int)rv; i <= r; ++i) {
+	    if ((double)fn / ++d2 != a) break;
+	    if (fn % (d1 = d2) == 0) break;
+	}
+    }
+    b = BDIGITDBL2NUM(fn);
+    d = BDIGITDBL2NUM(d1);
+    if (f < 0) b = f_negate(b);
+#else
+    VALUE d2, fn, rv;
+    VALUE b = rb_dbl2big(f);
+    VALUE d = rb_big_pow(rb_uint2big(FLT_RADIX), INT2FIX(-n));
+    if (FIXNUM_P(d)) {
+	d = rb_uint2big(FIX2LONG(d));
+    }
+    d2 = d;
+    fn = f_abs(b);
+    rv = rb_big_modulo(d, fn);
+    if (FIXNUM_P(rv) && (r = FIX2LONG(rv)) < 10) {
+	for (i = 1; i <= r; ++i) {
+	    d2 = f_sub(d2, INT2FIX(1));
+	    if (RFLOAT_VALUE(f_fdiv(fn, d2)) != a) break;
+	    if (f_mod(fn, d = d2) == INT2FIX(0)) break;
+	}
+    }
+    else if (FIXNUM_P(rv = f_sub(fn, rv)) && (r = FIX2LONG(rv)) < 10) {
+	for (i = 1; i <= r; ++i) {
+	    d2 = f_add(d2, INT2FIX(1));
+	    if (RFLOAT_VALUE(f_fdiv(fn, d2)) != a) break;
+	    if (f_mod(fn, d = d2) == INT2FIX(0)) break;
+	}
+    }
+#endif
+    return rb_rational_new(b, d);
+}
+
 static VALUE
 float_to_r(VALUE self)
 {
-    VALUE f, n;
+    double a, f;
+    int n;

-    float_decode_internal(self, &f, &n);
-    return f_mul(f, f_expt(INT2FIX(FLT_RADIX), n));
+    a = RFLOAT_VALUE(self);
+    f = frexp(a, &n);
+    f = ldexp(f, DBL_MANT_DIG);
+    n -= DBL_MANT_DIG;
+    if (n <= DBL_MANT_DIG && f != 0) {
+	return float_r_round(a, f, n);
+    }
+    return f_mul(rb_dbl2big(f), f_expt(INT2FIX(FLT_RADIX), INT2FIX(n)));
 }

@@ -1569,4 +1640,7 @@ Init_Rational(void)
     id_to_s = rb_intern("to_s");
     id_truncate = rb_intern("truncate");
+#ifdef NEEDS_FDIV
+    id_fdiv = rb_intern("fdiv");
+#endif

     ml = (long)(log(DBL_MAX) / log(2.0) - 1);


-- 
Nobu Nakada

Updated by matz (Yukihiro Matsumoto) about 3 years ago

Hi,

In message "Re: [ruby-core:23465] Re: [Feature #1408] 0.1.to_r not equal to  (1/10)"
    on Sat, 16 May 2009 06:23:53 +0900, Nobuyoshi Nakada <nobu@ruby-lang.org> writes:

|What about this?

Could you explain how this patch differs from the original?

							matz.

Updated by nobu (Nobuyoshi Nakada) about 3 years ago

Hi,

At Mon, 18 May 2009 11:15:16 +0900,
Yukihiro Matsumoto wrote in [ruby-core:23487]:
> Could you explain how this patch differs from the original?

It searches for a more reducible numerator that still round trips.  Since
it only tries the numerator, and only under very restricted conditions,
better results might be achieved in other cases by also trying the
denominator.  In fact, the patch works for very simple
cases, e.g. 0.1 and (1.0/3.0), but not for 0.24.
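
The idea of "a simpler fraction that still round trips" can be illustrated in Ruby with a brute-force search. This is only a sketch of the goal, not the algorithm in the patch above: the helper name `simplest_round_trip` and the `max_den` limit are hypothetical, and the search is far slower than anything one would put in rational.c.

```ruby
# Hypothetical helper: try denominators in increasing order and return the
# first fraction that converts back to exactly the same Float.
# Returns nil if nothing is found up to max_den.
def simplest_round_trip(x, max_den = 10_000)
  (1..max_den).each do |den|
    candidate = Rational((x * den).round, den)
    return candidate if candidate.to_f == x
  end
  nil
end

p simplest_round_trip(0.1)         # (1/10)
p simplest_round_trip(1.0 / 3.0)   # (1/3)
p simplest_round_trip(0.24)        # (6/25)
```

Unlike the patch, this search also handles 0.24, at the cost of a loop whose length depends on the denominator.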

-- 
Nobu Nakada

Updated by yugui (Yuki Sonoda) almost 3 years ago

  • Target version changed from 1.9.1 to 1.9.2

Updated by marcandre (Marc-Andre Lafortune) over 2 years ago

  • Category set to core
  • Assignee set to matz (Yukihiro Matsumoto)

Updated by marcandre (Marc-Andre Lafortune) over 2 years ago

Sorry to be late to the party on this one.

It is important to remember that a Float is always an approximation.

1.0 has to be understood as 1.0 +/- EPSILON, where the EPSILON is platform dependent. 1.0 is not more equal to 1 than to 1 + EPSILON/2. Indeed, there is no way to distinguish either when they are stored as floats.
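
The claim about 1 + EPSILON/2 can be checked directly. The sketch below assumes IEEE 754 doubles with the default round-to-nearest-even mode and a Ruby that provides Float#next_float.

```ruby
eps = Float::EPSILON            # gap between 1.0 and the next Float

p 1.0 + eps == 1.0.next_float   # true
p 1.0 + eps / 2 == 1.0          # true: the sum rounds back to 1.0
p 1.0 + eps == 1.0              # false
```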

To believe that Float#to_s loses data is wrong. If r.to_s returns "1.2", it implies that 1.2 is one of the values in the range of possible values for that floating number. It could have been 1.2000...0006. Or something else. There is no way to know, so #to_s chooses, wisely, to return the simplest value in the range.

There are many rationals that would be encoded as floats the same way. There is no magic way to know that the "exact" value was exactly 12/10 or 5404319552844595/4503599627370496, or anything in between. All have the same representation as a float. There is no reason to believe that the missing (binary) decimals that couldn't be written in the space allowed were all 0. Actually, there is reason to believe that they were _probably_ non zero, because fractions that cannot be expressed with a finite number of terms in their expansion in a given base all have a recurring expansion. I.e. if the significand does not end with a whole bunch of zeros (rational has finite expansion) then it probably ends with an infinite pattern (say 011011011 in binary, or 333333 in decimal).

For any given float, there is one and only one rational with the smallest denominator that falls in the range of its possible values. It is currently given by Number#rationalize, and I really do not understand why #to_r would return anything else. 
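
The two fractions named above can be compared in Ruby. This is a sketch assuming IEEE 754 doubles and a Ruby with Float#rationalize.

```ruby
x = 1.2

p x.to_r                       # (5404319552844595/4503599627370496)
p Rational(12, 10).to_f == x   # true
p x.to_r.to_f == x             # true
p x.to_r == Rational(12, 10)   # false: distinct rationals, one Float
p x.rationalize                # (6/5)
```

Both rationals convert to the same Float, so the Float alone cannot tell them apart. to_r picks the one equal to the stored bits, and rationalize picks the one with the smallest denominator.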

I cannot see any purpose to any other fraction. Moreover, the current algorithm, which returns the middle of the range of possibilities, is platform dependent since the range of possibilities is platform dependent. That makes it even less helpful.

Is there an example where one would want 0.1.to_r to be 3602879701896397/36028797018963968 ?  
Do we really think that 0.1.to_r being 3602879701896397/36028797018963968 corresponds to the principle of least surprise?
Note that I'm writing that fraction but with a different native double encoding, the fraction would be different.

Updated by znz (Kazuhiro NISHIYAMA) about 2 years ago

  • Status changed from Open to Assigned
  • Target version changed from 1.9.2 to 2.0.0

Updated by marcandre (Marc-Andre Lafortune) about 2 years ago

Why isn't Float#to_r  simply calling Float#rationalize ?

Updated by mwaechter (Matthias Wächter) about 2 years ago

On 20.09.2009 06:17, Marc-Andre Lafortune wrote:
> Sorry to be late to the party on this one.

I’m late as well ;)

> It is important to remember that a Float is always an approximation.

No. It is an approximation only for:

• conversion from most decimal numbers, especially floats, and
• calculations that drop digits.

You can do exact math in a limited range of operations, and the question 
should be whether the approximation approach should overrule this exact 
math range of use, especially considering that conversion back to 
decimal _could_ be done precisely, however, sometimes requiring a bunch 
of digits.

> 1.0 has to be understood as 1.0 +/- EPSILON, where the EPSILON is platform
> dependent. 1.0 is not more equal to 1 than to 1 + EPSILON/2. Indeed, there
> is no way to distinguish either when they are stored as floats.

If what’s stored in the Float _is_ your precise result, you certainly 
would not ask for precision reduction just because it _could_ have been 
the result of an imprecise calculation.

> To believe that Float#to_s loses data is wrong.

I think there should be both a Float#to_s and Float#to_nearest_s. The 
first would be precise, the second would output the “shortest” decimal 
representation within ±EPSILON/2.

> If r.to_s returns "1.2", it implies that 1.2 is one of the values in the
> range of possible values for that floating number. It could have been
> 1.2000...0006. Or something else. There is no way to know, so #to_s chooses,
> wisely, to return the simplest value in the range.

This is based on the assumption that no-one would ever care about 
Float’s precision.

> There are many rationals that would be encoded as floats the same way. There
> is no magic way to know that the "exact" value was exactly 12/10 or
> 5404319552844595/4503599627370496, or anything in between. All have the same
> representation as a float. There is no reason to believe that the missing
> (binary) decimals that couldn't be written in the space allowed were all 0.
> Actually, there is reason to believe that they were _probably_ non zero,
> because fractions that can not be expressed with a finite number of terms in
> their expansion in a given base all have a recurring expansion. I.e. if the
> significand does not end with a whole bunch of zeros (rational has finite
> expansion) then it probably ends with an infinite pattern (say 011011011 in
> binary, or 333333 in decimal).
>
> For any given float, there is one and only one rational with the smallest
> denominator that falls in the range of its possible values. It is currently
> given by Number#rationalize, and I really do not understand why #to_r would
> return anything else.
>
> I cannot see any purpose to any other fraction. Moreover, the current algorithm,
> which returns the middle of the range of possibilities, is platform dependent
> since the range of possibilities is platform dependent. That makes it even less
> helpful.

> Is there an example where one would want 0.1.to_r to be
> 3602879701896397/36028797018963968 ?

If the binary/Float’s representation of 
3602879701896397/36028797018963968 is the real result of the 
calculation? How do you know?

> Do we really think that 0.1.to_r being 3602879701896397/36028797018963968
> corresponds to the principle of least surprise?

False assumption here. Using floats for exact decimal math already 
violates POLS. Don’t blame the messenger, i.e. the converter back to 
decimal, the only part of the game that could _always_ be precise.

> Note that I'm writing that fraction but with a different native double
> encoding, the fraction would be different.

Sure. Great to have different levels of precision/imprecision from the 
computers.

And portability is not always the issue, otherwise there would have 
never been different native floating point precisions.

– Matthias

Updated by mwaechter (Matthias Wächter) about 2 years ago

Hello Marc-Andre,

On 19.04.2010 00:14, Marc-Andre Lafortune wrote:
> I hope my dissent will not sound too harsh.

Not at all.

> Arguing that 0.1.to_r should be 3602879701896397/36028797018963968 is
> the same as arguing that 0.1.to_s should output these 55 decimals.

Right, that’s my point. 0.1 as a Float has a precise meaning in binary as in decimal, so Float#to_s should keep those 55 decimals. That’s why I said
that Float#to_nearest_s – choose a better name or an option to Float#to_s – should be created that does »what everyone expects« to_s to do.

The same applies to Float#to_r. It should be as precise as possible, which it is currently. The function that does »what everyone expects« should be
Float#to_nearest_r in the same way as for the string representation.
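
The "precise" string form argued for here can be derived from the exact rational, because a denominator of 2**k always gives a terminating decimal. The helper below is a hypothetical sketch (the name `exact_decimal` is not an existing Ruby method) and assumes a binary Float, so that the reduced denominator of to_r is a power of two.

```ruby
# Hypothetical helper: the full decimal expansion of the value a Float holds.
def exact_decimal(float)
  r    = float.to_r
  sign = r < 0 ? "-" : ""
  num  = r.numerator.abs
  k    = r.denominator.bit_length - 1   # denominator == 2**k
  return "#{sign}#{num}" if k.zero?

  # num / 2**k == (num * 5**k) / 10**k, so the digits are those of num * 5**k
  # with the decimal point placed k digits from the right.
  digits = (num * 5**k).to_s.rjust(k + 1, "0")
  "#{sign}#{digits[0...-k]}.#{digits[-k, k]}"
end

puts exact_decimal(0.1)
# 0.1000000000000000055511151231257827021181583404541015625
puts exact_decimal(0.5)   # 0.5
```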

> For these reasons, the set S is of little interest to anybody.

The problem is that most people think that floating point arithmetic is precise, which it is only for the cases I described in my last mail.

> What *is* interesting is the set of real numbers. Floating numbers are
> used to represent them *approximately*. To add to my voice, here are a
> couple of excerpts from the first links that come up on google
> (highlight mine):
> 
> "In computing, floating point describes a system for representing
> numbers that would be too large or too small to be represented as
> integers. Numbers are in general represented *approximately* to a
> fixed number of significant digits and scaled using an exponent."
> http://en.wikipedia.org/wiki/Floating_point
> 
> "Squeezing infinitely many real numbers into a finite number of bits
> requires an *approximate* representation.... Therefore the result of a
> floating-point calculation must often be rounded in order to fit back
> into its finite representation. This rounding error is the
> characteristic feature of floating-point computation."  source:
> http://docs.sun.com/source/806-3568/ncg_goldberg.html

That’s where the problem starts. Everyone thinks he can do exact math on a computer, and the only problem was the approximation of the binary
representation of a real number, characterized by ±EPSILON/2. No, the _real_ issue is the approximation of calculations which not only accumulates
EPSILON with each calculation, but it can shift EPSILON to any order. Think of something trivial like (1E-40+0.1-0.1) returning 0.0 vs.
(1E-40+0.3-0.2-0.1) returning -2.7E-17. There is no real math in floats.
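
The two expressions can be run as written, and compared with the same sums in exact rational arithmetic. This is a sketch assuming IEEE 754 doubles; the variable name `tiny` is only for illustration.

```ruby
p 1e-40 + 0.1 - 0.1         # 0.0
p 1e-40 + 0.3 - 0.2 - 0.1   # -2.7755575615628914e-17

# With Rational operands nothing is rounded, so the tiny term survives.
tiny = Rational(1, 10**40)
p tiny + Rational(1, 10) - Rational(1, 10) == tiny                     # true
p tiny + Rational(3, 10) - Rational(2, 10) - Rational(1, 10) == tiny   # true
```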

One can go as far as saying that availability of math-like operators and math-like precedence in a programming language supports the expectations of
real-number-like behavior and precision. But this is slightly off-topic, and in fact method calls for simple math are not doing any good to
readability. Math-like operator precedence is different and something completely unnecessary in a programming language, IMHO.

> Note that typing 0.1 in Ruby is a "calculation" which consists in
> finding the member of S closest to 1/10.
> 
> Your final question was: how do I know that the value someone is
> talking about is 0.1 and not
> 0.1000000000000000055511151231257827021181583404541015625 (or
> equivalently 3602879701896397/36028797018963968) ?
> 
> I call it common sense.

It looks so obvious when we are talking about 0.1. If we talk about any other number with 80 digits, my point may become clearer.

What do you do if it’s not 0.1 a.k.a. 0.1000000000000000055511151231257827021181583404541015625 but
0.09999999999999997779553950749686919152736663818359375 (the result of 0.3-0.2)? What’s the difference for your argument? Now we will not get back
the expected nearest 0.1 anyway without applying the actually required/expected rounding constraints.

If it’s just about 0.1.to_r, i.e. converting from a decimal constant number to rational, use String#to_r.
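
A short Ruby sketch of this point, assuming IEEE 754 doubles and the Float#rationalize of current Ruby releases:

```ruby
p "0.1".to_r    # (1/10)

diff = 0.3 - 0.2
p diff == 0.1                            # false
p diff.rationalize == Rational(1, 10)    # false: 1/10 is too far from diff

# Staying in decimal strings and rationals keeps the arithmetic exact.
p "0.3".to_r - "0.2".to_r == "0.1".to_r  # true
```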

Bottom line: Floats are not exact in terms of math, but they are exact in terms of computer-level implementation, implementing IEEE 754. We should
respect the latter and help people deal with the former.

– Matthias

Updated by tadf (tadayoshi funaba) about 2 years ago

> Why isn't Float#to_r  simply calling Float#rationalize ?

a = 0.5337486539516013
b = 0.5337486539516012

a == b #=> false

a.to_r == a #=> true
a.rationalize == a #=> false

a.to_r == b #=> false
a.rationalize == b #=> true

actually, flonum is restricted rational number.
however, rationalize bends the value.

to_r is the simplest and the cheapest way, rationalize is not so.

moreover, various languages support exact conversion (e.g. CL, Scheme, Haskell, Squeak, Python).
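
Why the exact conversion is cheap can be seen from a sketch that rebuilds it from the significand and exponent, in the same spirit as the frexp/ldexp calls in the patch earlier in this thread. The helper name `exact_rational` is hypothetical, and the code assumes a binary Float (FLT_RADIX == 2) and finite input.

```ruby
# Hypothetical re-implementation of the exact conversion.
def exact_rational(float)
  frac, exp = Math.frexp(float)                   # float == frac * 2**exp
  mant = Math.ldexp(frac, Float::MANT_DIG).to_i   # significand as an Integer
  Rational(mant) * Rational(2)**(exp - Float::MANT_DIG)
end

a = 0.5337486539516013
p exact_rational(a) == a.to_r   # true
p a.to_r == a                   # true
p a.to_r.to_f == a              # true
p exact_rational(0.1)           # (3602879701896397/36028797018963968)
```

No search is involved: one decomposition and one power of two give the result.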

Updated by mrkn (Kenta Murata) about 2 years ago

Float#rationalize is added again at r27503.
Please check that revision.

On 2010/05/06, at 7:23, Marc-Andre Lafortune wrote:

> Maybe a kind Japanese reader can provide the gist of [ruby-dev:41061]
> to explain why was Float#rationalize removed?
> 
> I would also appreciate opinions as to why it wouldn't be a net
> improvement if to_r used the rationalize algorithm and some other
> methods were provided for anyone wanting the value of the
> representation (e.g. Float#representation which would return [sign,
> mantissa, significand] and/or Float#representation_to_r would give the
> rational corresponding to the internal representation of that float)
> 

--
Kenta Murata
OpenPGP FP = FA26 35D7 4F98 3498 0810 E0D5 F213 966F E9EB 0BCC

E-mail: mrkn@mrkn.jp
twitter: http://twitter.com/mrkn/
blog: http://d.hatena.ne.jp/mrkn/

Updated by mrkn (Kenta Murata) 9 months ago

  • Status changed from Assigned to Closed
I am closing this ticket because the discussion has diverged too far from the original topic. If anyone has objections, please open new tickets against the current version of Ruby.
