Project

General

Profile

Feature #12180

switch id_table.c variant

Added by funny_falcon (Yura Sokolov) over 1 year ago. Updated 10 months ago.

Status:
Closed
Priority:
Normal
Target version:
-
[ruby-core:74349]

Description

Currently used variant is 'binary search in small table + hash for large tables'.
But for contemporary CPU it may be better to do linear scan for small tables.

It is already implemented in id_table.c and numbered as 35.

Tested with simple Redmine installation on Intel Haswell i7-4770 CPU @ 3.40GHz

  • trunk: Requests per second: 27.79 [#/sec] (mean)
  • with switched implementation: Requests per second: 28.87 [#/sec] (mean)
0001-id_table.c-switch-id_table-variant.patch (767 Bytes) 0001-id_table.c-switch-id_table-variant.patch funny_falcon (Yura Sokolov), 03/15/2016 06:51 PM

Associated revisions

Revision 57416
Added by ko1 (Koichi Sasada) 10 months ago

swithc id_table data structure.

  • id_table.c: swtich to "simple open addressing with quadratic probing" by Yura Sokolov. For more detail measurements, see [Feature #12180]
  • id_table.c: remove other algorithms to simplify the source code.

History

#1 [ruby-core:74350] Updated by funny_falcon (Yura Sokolov) over 1 year ago

Well, in fact '22' implementation (simple open addressing with quadratic probing) is even faster:

Requests per second: 29.67 [#/sec] (mean)

And it uses just 0.5-1% more memory.

#2 [ruby-core:74360] Updated by ko1 (Koichi Sasada) over 1 year ago

Hi,

simple Redmine installation

What is this benchmarking?

Thanks,
Koichi

#3 [ruby-core:74362] Updated by funny_falcon (Yura Sokolov) over 1 year ago

I just install Redmine - the same software that powers bugs.ruby-lang.org ,
then add several test issues, and bench it with 'apache benchmark':
ab -n 1000 -c 10 http://localhost:3000/projects/general/issues
(general is a name of test project)

#4 [ruby-core:74387] Updated by funny_falcon (Yura Sokolov) over 1 year ago

In other words, big web applications are too big for global method cache, and too polymorphic for inline cache.
So general method lookup is inevitable.
With current choice for id_table implementation (34) it takes about 4.5-6%CPU.
With implementation 22 it takes just 2.5-3% CPU.

#5 [ruby-core:77514] Updated by funny_falcon (Yura Sokolov) about 1 year ago

Please, reconsider benchmarking it with realworld applications.

#6 [ruby-core:78045] Updated by ko1 (Koichi Sasada) about 1 year ago

I measured them like st_table:
https://gist.github.com/ko1/b8714907c328bc29ba9501cfcb1abc8a

And ffalcon's 22 implementation is best. We need to switch to it.

Naruse-san, can I switch it now?

#7 [ruby-core:78046] Updated by ko1 (Koichi Sasada) about 1 year ago

Yura: can you make a table more small with a few entries?

And do you have a commit permission?

#8 [ruby-core:78773] Updated by hsbt (Hiroshi SHIBATA) 11 months ago

I asked this feature to ko1. He said that it's better to merge at Ruby 2.5.

We will evaluate this after Ruby 2.4.0 release.

#9 [ruby-core:79129] Updated by shyouhei (Shyouhei Urabe) 10 months ago

  • Assignee set to ko1 (Koichi Sasada)
  • Status changed from Open to Assigned

#10 [ruby-core:79171] Updated by shyouhei (Shyouhei Urabe) 10 months ago

We looked at this issue in yesterday's developer meeting.

(As ko1 measured before,) we confirmed that ffalcon's implementation is the fastest. Ko1 is assigned to switch to that.

#11 [ruby-core:79200] Updated by funny_falcon (Yura Sokolov) 10 months ago

Excuse me for not reacting on discussion.
I forgot to turn on mail notifications for this issue.

can you make a table more small with a few entries?

It could be 25% smaller for any size (on x86_64) at cost of second memory lookup
if key and value stored in separate arrays. It should be measured.
Single pointer still could be used as with "USE_CALC_VALUES".

ko1, will you just switch id or remove other implementations?
if first, I may prepare patch for current source, otherwise I will wait for commit.

And do you have a commit permission?

No I haven't. And I doubt I should have.

#12 [ruby-core:79202] Updated by funny_falcon (Yura Sokolov) 10 months ago

Oh, to reduce hash size either max serial_id should be 0x7fffffff instead of 0xffffffff
(cause 1 bit is stolen for collision check),
or collision bit removed (it will increase time for "miss" and decrease for "hit").

#13 Updated by ko1 (Koichi Sasada) 10 months ago

  • Status changed from Assigned to Closed

Applied in changeset r57416.


swithc id_table data structure.

  • id_table.c: swtich to "simple open addressing with quadratic probing" by Yura Sokolov. For more detail measurements, see [Feature #12180]
  • id_table.c: remove other algorithms to simplify the source code.

#14 [ruby-core:79258] Updated by ko1 (Koichi Sasada) 10 months ago

Yura:
Sorry for my laziness. I committed it (remain only your algorithm) and cleanup source code.
Pls make another ticket to improve id_table structure.

#15 [ruby-core:79367] Updated by funny_falcon (Yura Sokolov) 10 months ago

Created https://bugs.ruby-lang.org/issues/13174 with implementation smaller in memory.

Also available in: Atom PDF