Feature #13368
closedImprove performance of Array#sum with float elements
Description
The declaration of local variable in loop, it will initialize local variable for each run of the loop with clang generated code.
So, it shouldn't declare the local variable in heavy loop.
Array#sum with float elements will be faster around 30%.
Before¶
user system total real
3.320000 0.010000 3.330000 ( 3.336088)
After¶
user system total real
2.590000 0.010000 2.600000 ( 2.602399)
Test code¶
require 'benchmark'
Benchmark.bmbm do |x|
ary = []
10000.times { ary << Random.rand }
x.report do
50000.times do
ary.sum
end
end
end
Patch¶
Updated by normalperson (Eric Wong) almost 8 years ago
+Cc ruby-core since the original post was English
watson1978@gmail.com wrote:
Issue #13368 has been reported by watson1978 (Shizuo Fujita).
Bug #13368: Improve performance of Array#sum with float elements
https://bugs.ruby-lang.org/issues/13368
The declaration of local variable in loop, it will initialize local variable for each run of the loop with clang generated code.
So, it shouldn't declare the local variable in heavy loop.
I disagree with this change. Different compilers optimize
differently.
Patch¶
Disclaimer: I used commit 3635f883096604d0e6453dc9d2484d5c92467109
after having:
fetch = +refs/pull/*:refs/remotes/ruby/pull/*
In the [remote ] section of my .git/config and did not
use any proprietary JavaScript or browser to apply your change.
Updated by normalperson (Eric Wong) almost 8 years ago
+Cc ruby-core since the original post was English
watson1978@gmail.com wrote:
Issue #13368 has been reported by watson1978 (Shizuo Fujita).
Bug #13368: Improve performance of Array#sum with float elements
https://bugs.ruby-lang.org/issues/13368
The declaration of local variable in loop, it will initialize local variable for each run of the loop with clang generated code.
So, it shouldn't declare the local variable in heavy loop.
I disagree with this change. Different compilers optimize
differently.
Patch¶
Disclaimer: I used commit 3635f883096604d0e6453dc9d2484d5c92467109
after having:
fetch = +refs/pull/*:refs/remotes/ruby/pull/*
In the [remote ] section of my .git/config and did not
use any proprietary JavaScript or browser to apply your change.
Updated by watson1978 (Shizuo Fujita) almost 8 years ago
When I filed this ticket, I tried to run benchmark on macOS + clang only.
Then, I tried to do on 2 environments in additional.
I found what my patch is effective with clang envrionment only (such as macOS or FreeBSD which use clang as default compiler).
macOS 10.12 + gcc 6.3.0¶
Before¶
user system total real
2.750000 0.000000 2.750000 ( 2.757864)
After¶
user system total real
2.730000 0.010000 2.740000 ( 2.740204)
Ubuntu 16.04.4 + gcc 5.4.0¶
Before¶
user system total real
2.400000 0.000000 2.400000 ( 2.395856)
After¶
user system total real
2.330000 0.000000 2.330000 ( 2.327889)
Updated by naruse (Yui NARUSE) almost 8 years ago
watson1978 (Shizuo Fujita) wrote:
When I filed this ticket, I tried to run benchmark on macOS + clang only.
Then, I tried to do on 2 environments in additional.I found what my patch is effective with clang envrionment only (such as macOS or FreeBSD which use clang as default compiler).
Hmm, the result shows clang is not efficient as gcc yet this time.
And improvement should be done by clang, not us.
Or should be done by something cleaner patch.
Updated by watson1978 (Shizuo Fujita) almost 8 years ago
naruse (Yui NARUSE) wrote:
Hmm, the result shows clang is not efficient as gcc yet this time.
And improvement should be done by clang, not us.
Or should be done by something cleaner patch.
Could you report this to clang developer If you don't want to fix this ?
Updated by naruse (Yui NARUSE) almost 8 years ago
watson1978 (Shizuo Fujita) wrote:
naruse (Yui NARUSE) wrote:
Hmm, the result shows clang is not efficient as gcc yet this time.
And improvement should be done by clang, not us.
Or should be done by something cleaner patch.Could you report this to clang developer If you don't want to fix this ?
Though I reported to clang when it breaks build like https://bugs.llvm.org//show_bug.cgi?id=8319
Clang is considered still development at performance like other inline threshold ones.
They should improve more general cases.
Updated by jeremyevans0 (Jeremy Evans) over 5 years ago
- Tracker changed from Bug to Feature
- Backport deleted (
2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN)
Updated by hsbt (Hiroshi SHIBATA) about 4 years ago
- Status changed from Open to Closed
- Assignee set to mrkn (Kenta Murata)