Array#pack("G") gives incorrect result sometimes. The problem occurs with:
- all Ruby versions I have tried: 2.0.0, 2.3.0 and Subversion-trunk (all built from source)
- only with 32-bit builds (not 64-bit)
- only with GCC (not CLANG)
- with GCC 4.8.5 (from Ubuntu 14.04) and also with 5.3.1 (from Ubuntu 16.04 beta)
- also with a Ruby 2.0.0 on Windows (installed with RubyInstaller I think)
The function failing is "HTOND" (it is really a macro) in "pack.c" in the function "pack_value".
When it is called, the macro expands to (reformatted for easy reading):
d = (memcpy(&(dtmp),&(d),sizeof(double)), (dtmp) = (0?((uint64_t)(dtmp)):__builtin_bswap64((uint64_t)(dtmp))), memcpy(&(d),&(dtmp),sizeof(double)), (d));
If I brake this complicated expression apart manually into two statements and rebuild ruby with this code:
memcpy(&(dtmp),&(d),sizeof(double)); ((dtmp) = (0?((uint64_t)(dtmp)):__builtin_bswap64((uint64_t)(dtmp))), memcpy(&(d),&(dtmp),sizeof(double)), (d));
the problem goes away. My conclusion is that GCC is too "aggressive" when optimizing the code as it looks
in the Ruby source, and my change stops GCC from doing that. As mentioned above, Ruby built with CLANG
doesn't seem to have this problem (Ubuntu 16.04 beta + 32-bit Ruby + Clang 3.8). And if I build "pack.c" with GCC
using "-O0" instead of "-O3" the probolem also seem to go away.
A simple example to demonstrate the problem is:
f = 2.769637943971985816 puts 'pack("D") failed' if [f].pack("D").unpack("D") != f puts 'pack("G") failed' if [f].pack("G").unpack("G") != f # output: pack("G") failed # wrong bit pattern: 40062837f038fe7f # correct bit pattern: 40062837f038f67f
Even if this turns out to be a pure GCC bug (I'm not absolutely certain about this),
it becomes a Ruby problem too since it seem to exist in all 32-bit Rubys I have tried.
Updated by naruse (Yui NARUSE) about 5 years ago
- Status changed from Open to Feedback
It looks because of x87 FPU's internal calculation behavior.
x87 always calculates floating points with 80bit precision on their register, even if the number is 64bit double.
And only when the number is assigned to a 64bit double variable, the number is rounded into 64bit.
It seems the reason why when you use ';' instead of ',', the issue disappear.
You may force FPU to always round in 64bit by following ways:
(1) assign the number to a variable
(2) specify -ffloat-store for gcc/clang, -Op for VC6, -fp:precise for VC8+.
(3) specify -msse2 -mfpmath=sse or similar one
(1) is the one Vit suggested and marcandre commited.
It is still problematic because a smart C compiler will optimize out a variable.
Failed test results I showed in are because of it.
So you must add volatile to guard the variable from such optimization.
But it is barrier for optimization on non x87 FPU.
(2) is not so good because it effects above volatile effect all over the code.
It degrade the speed.
(3) uses SSE2 to calculate floating point numbers.
This doesn't have such speed penalty, but it is less portable for CPUs.
Updated by johan556 (Johan Holmberg) about 5 years ago
I am aware of the x87 vs. SSE2 differences, but I don't think this issue has anything to do with that.
Note that my example demonstrated that a given Float, when processed by pack("G") + unpack("G") is changed into another bit pattern. No floating point calculation is involved. Also note that when pack("D") + unpack("D") is used, I get back the bit pattern I started with. That should have happened with "G" too.
The difference between these two cases is that conversions with "D" uses "native format", but "G" uses "network (big-endian) byte order". And for "network byte order", the macro HTOND used in "pack.c" was supposed to rearrange the bytes in the correct order.
In both 32-bit and 64-bit Ruby, a Float is a "double" at C level and has size 64-bit.
Updated by kazuho (Kazuho Oku) about 5 years ago
This is obviously a bug in the Ruby-side.
The original code, after swapping the bytes of a double, assigns it to a double.
d = (..., d);
At this point, bit pattern of the right-hand expression ends with f67f, which is a signalling NAN for x87.
However, there is no guarantee that the bit pattern of a NAN is retained after being assigned. In this case, I assume x87 is converting the value to quiet NAN, as discussed in http://stackoverflow.com/questions/22816095/signalling-nan-was-corrupted-when-returning-from-x86-function-flds-fstps-of-x87.
That is why the bit pattern changes to fe7f.
Updated by naruse (Yui NARUSE) almost 5 years ago
- Status changed from Feedback to Closed
Applied in changeset r55573.
pack.c (pack_pack): use union instead of bare variable to ease
optimizations and avoid assigning x87 floating point number.
[ruby-core:74496] [Bug #12209]
pack.c (pack_unpack): ditto.