Bug #7984

Severe speed issues in 2.0.0 compiled with Clang

Added by James Coleman about 1 year ago. Updated 7 months ago.

[ruby-core:53025]
Status:Closed
Priority:Normal
Assignee:-
Category:-
Target version:-
ruby -v:2.0.0p0 on darwin Backport:

Description

I've been comparing Ruby 2.0.0 with 1.9.3-head, both compiled with the same basic set of CFLAGs with Clang on Mac OS 10.8 (Mountain Lion). A decent sized rails app was showing performance degradations, so I looked into the most basic of (micro)benchmarks. Running the following:

t = Benchmark.measure do
50000000.times { Object.new }
end

Gives me the following output:

1.9.3-head: 11.240000 0.000000 11.240000 ( 11.247285)
2.0.0p0: 23.610000 0.010000 23.620000 ( 23.629643)

Ruby 2.0.0 takes over twice as long as 1.9.3. In contrast, running the same test under rubies compiled with GCC on linux (unfortunately I can't do a direct test on OSX because 2.0.0 currently won't compile under the current apple gcc 4.2) Ruby 2.0.0 shows a 15-20% speed improvement over 1.9.3.

So it seems there's a severe speed regression interaction with the ruby 2.0.0 code and Clang.

History

#1 Updated by Yui NARUSE about 1 year ago

  • Tracker changed from Backport to Bug
  • Project changed from Backport200 to ruby-trunk
  • Priority changed from High to Normal

jcole1989 (James Coleman) wrote:

unfortunately I can't do a direct test on OSX because 2.0.0 currently won't compile under the current apple gcc 4.2

I think it is not apple gcc 4.2 but apple llvm-gcc 4.2.
And did you install latest Command Line Tools with Xcode 4.6?

#2 Updated by James Coleman about 1 year ago

I have the latest command line tools and Xcode 4.6 installed. The gcc error is known, see: https://bugs.ruby-lang.org/issues/7956 -- the solution was that rvm switched to Clang as the default CC for now.

But all of that still leaves the issue of why the Clang build of 2.0.0 runs so much more slowly than the Clang build of 1.9.3.

#3 Updated by Yui NARUSE about 1 year ago

  • ruby -v set to 2.0.0p0 on darwin

jcole1989 (James Coleman) wrote:

I have the latest command line tools and Xcode 4.6 installed. The gcc error is known, see: https://bugs.ruby-lang.org/issues/7956 -- the solution was that rvm switched to Clang as the default CC for now.

It is not apple gcc 4.2, which is the one usually used on linux, but llvm-gcc.
If you want to use normal gcc, you need to install "apple-gcc42" through brew.

#4 Updated by James Coleman about 1 year ago

naruse (Yui NARUSE) wrote:

It is not apple gcc 4.2, which is the one usually used on linux, but llvm-gcc.
If you want to use normal gcc, you need to install "apple-gcc42" through brew.

I had "apple-gcc42" installed with brew, but I didn't realize that its executable is suffixed with -4.2. I was able to compile 2.0.0 with CC=gcc-4.2. The results would seem to imply that the 2.0.0 speed issue is a problem with Ruby 2.0.0 on OSX in general, not LLVM. The gcc version is faster than the Clang one, but still nowhere near as fast as 1.9.3.

2.0.0p0 (clang): 23.420000 0.010000 23.430000 ( 23.422059)
2.0.0p0 (gcc4.2): 20.270000 0.010000 20.280000 ( 20.273295)
1.9.3p392: 9.460000 0.000000 9.460000 ( 9.465449)

#5 Updated by Eric Hodel about 1 year ago

=begin
I cannot reproduce your result:

$ ruby -rbenchmark -ve 'puts Benchmark.measure { 50000000.times { Object.new } }'
ruby 2.0.0p0 (2013-02-24 revision 39473) [x8664-darwin12.2.1]
10.600000 0.010000 10.610000 ( 10.612499)
$ ruby19 -rbenchmark -ve 'puts Benchmark.measure { 50000000.times { Object.new } }'
ruby 1.9.3p374 (2013-01-15 revision 38858) [x86
64-darwin12.2.1]
11.410000 0.010000 11.420000 ( 11.412127)

Using ministat for a better comparison I see an 8.4% improvement +/- 0.8% with 1/10th the iterations per sample and 20 samples (overall, double the object creation):

$ for f in jot 20; do ruby -rbenchmark -e 'puts Benchmark.measure { 5000000.times { Object.new } }.real'; done > 200p0.txt
$ for f in jot 20; do ruby19 -rbenchmark -e 'puts Benchmark.measure { 5000000.times { Object.new } }.real'; done > 193p374.txt
$ ministat 200p0.txt 193p374.txt
x 200p0.txt
+ 193p374.txt
+--------------------------------------------------------------------------+
| + |
| + + |
| xx + + + |
| xxxx x + + ++ + |
|xx xxxxxx xx x x x ++ + ++ + + + +|
| |_MA| |MA______| |
+--------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 20 1.031859 1.08182 1.045697 1.0484288 0.012780833
+ 20 1.1223431 1.172935 1.1333108 1.1362227 0.01309569
Difference at 95.0% confidence
0.087794 +/- 0.00828168
8.37386% +/- 0.789914%
(Student's t, pooled s = 0.0129392)

Here are my CC and CFLAGS settings:

$ ruby -rrbconfig -ve 'p RbConfig::CONFIG.values_at(*%w[CC CFLAGS])'
ruby 2.0.0p0 (2013-02-24 revision 39473) [x86_64-darwin12.2.1]
["clang", " -O3 -fno-fast-math -ggdb3 -Wall -Wextra -Wno-unused-parameter -Wno-parentheses -Wno-long-long -Wno-missing-field-initializers -Wunused-variable -Werror=pointer-arith -Werror=write-strings -Werror=declaration-after-statement -Werror=shorten-64-to-32 -Werror=implicit-function-declaration -pipe"]

$ ruby19 -rrbconfig -ve 'p RbConfig::CONFIG.values_at(*%w[CC CFLAGS])'
ruby 1.9.3p374 (2013-01-15 revision 38858) [x86_64-darwin12.2.1]
["clang", " -O3 -ggdb -Wall -Wextra -Wno-unused-parameter -Wno-parentheses -Wno-long-long -Wno-missing-field-initializers -Werror=pointer-arith -Werror=write-strings -Werror=declaration-after-statement -Werror=shorten-64-to-32 -Werror=implicit-function-declaration -pipe"]
=end

#6 Updated by Michael Granger about 1 year ago

=begin
I also see the same thing Eric does, but with 1.9.3p392:

$ ruby -rbenchmark -ve 'puts Benchmark.measure { 50000000.times { Object.new } }'
ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-darwin12.2.1]
9.400000 0.100000 9.500000 ( 9.497637)

$ ruby -rbenchmark -ve 'puts Benchmark.measure { 50000000.times { Object.new } }'
ruby 1.9.3p392 (2013-02-22 revision 39386) [x86_64-darwin12.2.1]
10.580000 0.060000 10.640000 ( 10.636003)

=end

#7 Updated by James Coleman about 1 year ago

The only difference in my CFLAGS is that mine also has "-fno-common".

However, I ran:
ruby -rbenchmark -ve 'puts Benchmark.measure { 50000000.times { Object.new } }'
and was unable to reproduce. However, I can reproduce every single time running the same ruby code from within irb. Is there something radically different about how irb runs its interpreter in 2.0.0 versus 1.9.3?

#8 Updated by Eric Hodel about 1 year ago

=begin

Even with irb:

$ irb -rbenchmark
irb(main):001:0> RUBYDESCRIPTION
=> "ruby 2.0.0p0 (2013-02-24 revision 39473) [x86
64-darwin12.2.1]"
irb(main):002:0> Benchmark.measure { 50000000.times { Object.new } }
=> #

$ irb19 -rbenchmark
irb(main):001:0> RUBYDESCRIPTION
=> "ruby 1.9.3p374 (2013-01-15 revision 38858) [x86
64-darwin12.2.1]"
irb(main):002:0> Benchmark.measure { 50000000.times { Object.new } }
=> 14.810000 0.000000 14.810000 ( 14.819111)

Maybe it is due to #7778?
=end

#9 Updated by James Coleman about 1 year ago

I don't have a locale set, so I don't think that that could be the problem.

And even so, how would that affect a tight loop?

#10 Updated by Motohiro KOSAKI about 1 year ago

This is Linux result

% ruby -v test.rb

ruby 1.9.3p392 (2013-02-22 revision 39386) [x86_64-linux]
3.010000 0.000000 3.010000 ( 3.000945)

% ruby -v test.rb
ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-linux]
3.470000 0.000000 3.470000 ( 3.462430)

% ruby -v test.rb
ruby 1.9.3p392 (2013-02-22 revision 39386) [x86_64-linux]
3.060000 0.000000 3.060000 ( 3.056820)

Hm. Someone seems made a regression and fixed it.

#11 Updated by Motohiro KOSAKI about 1 year ago

sorry, my previous mail has a typo. I've measured them again.


% ruby -v ../test63.rb

ruby 1.9.3p392 (2013-02-22 revision 39386) [x86_64-linux]
3.460000 0.000000 3.460000 ( 3.458038)

% ruby -v ../test63.rb
ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-linux]
3.820000 0.000000 3.820000 ( 3.811076)

% ruby -v ../test63.rb
ruby 2.1.0dev (2013-03-20 trunk 39832) [x86_64-linux]
3.290000 0.000000 3.290000 ( 3.289948)

#12 Updated by James Coleman 7 months ago

With 2.0.0-p247 I get:

11.540000 0.010000 11.550000 ( 11.561841)

So apparently whatever the issue was got fixed either in rvm or in Clang or OSX or in the newer patch level of Ruby.

#13 Updated by Koichi Sasada 7 months ago

  • Status changed from Open to Closed

Also available in: Atom PDF