Feature #12180
closedswitch id_table.c variant
Description
Currently used variant is 'binary search in small table + hash for large tables'.
But for contemporary CPU it may be better to do linear scan for small tables.
It is already implemented in id_table.c
and numbered as 35.
Tested with simple Redmine installation on Intel Haswell i7-4770 CPU @ 3.40GHz
- trunk:
Requests per second: 27.79 [#/sec] (mean)
- with switched implementation:
Requests per second: 28.87 [#/sec] (mean)
Files
Updated by funny_falcon (Yura Sokolov) over 8 years ago
Well, in fact '22' implementation (simple open addressing with quadratic probing
) is even faster:
Requests per second: 29.67 [#/sec] (mean)
And it uses just 0.5-1% more memory.
Updated by ko1 (Koichi Sasada) over 8 years ago
Hi,
simple Redmine installation
What is this benchmarking?
Thanks,
Koichi
Updated by funny_falcon (Yura Sokolov) over 8 years ago
I just install Redmine - the same software that powers bugs.ruby-lang.org ,
then add several test issues, and bench it with 'apache benchmark':
ab -n 1000 -c 10 http://localhost:3000/projects/general/issues
(general
is a name of test project)
Updated by funny_falcon (Yura Sokolov) over 8 years ago
In other words, big web applications are too big for global method cache, and too polymorphic for inline cache.
So general method lookup is inevitable.
With current choice for id_table implementation (34) it takes about 4.5-6%CPU.
With implementation 22 it takes just 2.5-3% CPU.
Updated by funny_falcon (Yura Sokolov) about 8 years ago
Please, reconsider benchmarking it with realworld applications.
Updated by ko1 (Koichi Sasada) about 8 years ago
I measured them like st_table:
https://gist.github.com/ko1/b8714907c328bc29ba9501cfcb1abc8a
And ffalcon's 22 implementation is best. We need to switch to it.
Naruse-san, can I switch it now?
Updated by ko1 (Koichi Sasada) about 8 years ago
Yura: can you make a table more small with a few entries?
And do you have a commit permission?
Updated by hsbt (Hiroshi SHIBATA) almost 8 years ago
I asked this feature to ko1. He said that it's better to merge at Ruby 2.5.
We will evaluate this after Ruby 2.4.0 release.
Updated by shyouhei (Shyouhei Urabe) almost 8 years ago
- Status changed from Open to Assigned
- Assignee set to ko1 (Koichi Sasada)
Updated by shyouhei (Shyouhei Urabe) almost 8 years ago
We looked at this issue in yesterday's developer meeting.
(As ko1 measured before,) we confirmed that ffalcon's implementation is the fastest. Ko1 is assigned to switch to that.
Updated by funny_falcon (Yura Sokolov) almost 8 years ago
Excuse me for not reacting on discussion.
I forgot to turn on mail notifications for this issue.
can you make a table more small with a few entries?
It could be 25% smaller for any size (on x86_64) at cost of second memory lookup
if key and value stored in separate arrays. It should be measured.
Single pointer still could be used as with "USE_CALC_VALUES".
ko1, will you just switch id or remove other implementations?
if first, I may prepare patch for current source, otherwise I will wait for commit.
And do you have a commit permission?
No I haven't. And I doubt I should have.
Updated by funny_falcon (Yura Sokolov) almost 8 years ago
Oh, to reduce hash size either max serial_id should be 0x7fffffff instead of 0xffffffff
(cause 1 bit is stolen for collision check),
or collision bit removed (it will increase time for "miss" and decrease for "hit").
Updated by ko1 (Koichi Sasada) almost 8 years ago
- Status changed from Assigned to Closed
Applied in changeset r57416.
swithc id_table data structure.
- id_table.c: swtich to "simple open addressing with quadratic probing"
by Yura Sokolov. For more detail measurements, see [Feature #12180] - id_table.c: remove other algorithms to simplify the source code.
Updated by ko1 (Koichi Sasada) almost 8 years ago
Yura:
Sorry for my laziness. I committed it (remain only your algorithm) and cleanup source code.
Pls make another ticket to improve id_table structure.
Updated by funny_falcon (Yura Sokolov) almost 8 years ago
Created https://bugs.ruby-lang.org/issues/13174 with implementation smaller in memory.