https://bugs.ruby-lang.org/
https://bugs.ruby-lang.org/favicon.ico?1711330511
2015-08-06T08:56:24Z
Ruby Issue Tracking System
Ruby master - Feature #11420: Introduce ID key table into MRI
https://bugs.ruby-lang.org/issues/11420?journal_id=53677
2015-08-06T08:56:24Z
ko1 (Koichi Sasada)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-2 status-5 priority-4 priority-default closed" href="/issues/6962">Feature #6962</a>: Use lighter hash structure for methods table, instance variable positions, constants</i> added</li></ul>
Ruby master - Feature #11420: Introduce ID key table into MRI
https://bugs.ruby-lang.org/issues/11420?journal_id=53679
2015-08-06T08:56:33Z
ko1 (Koichi Sasada)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-2 status-5 priority-4 priority-default closed" href="/issues/11414">Feature #11414</a>: Relax ID table ordering</i> added</li></ul>
Ruby master - Feature #11420: Introduce ID key table into MRI
https://bugs.ruby-lang.org/issues/11420?journal_id=53681
2015-08-06T12:08:51Z
matz (Yukihiro Matsumoto)
matz@ruby.or.jp
<ul></ul><p>Go ahead and experiment the idea.</p>
<p>Matz.</p>
Ruby master - Feature #11420: Introduce ID key table into MRI
https://bugs.ruby-lang.org/issues/11420?journal_id=53682
2015-08-06T13:12:31Z
ngoto (Naohisa Goto)
ngotogenome@gmail.com
<ul><li><strong>Related to</strong> <i><a class="issue tracker-2 status-6 priority-4 priority-default closed" href="/issues/9638">Feature #9638</a>: [PATCH] limit IDs to 32-bits on 64-bit systems</i> added</li></ul>
Ruby master - Feature #11420: Introduce ID key table into MRI
https://bugs.ruby-lang.org/issues/11420?journal_id=53684
2015-08-06T13:42:00Z
ngoto (Naohisa Goto)
ngotogenome@gmail.com
<ul></ul><p>Indeed, I'm using machines with 2TB or 3TB main memory, and<br>
theoretically the upper limit of 1,374,389,534,720 = 1.3TB<br>
can be reached today.<br>
(though this may be very rare case in practice)</p>
<p>I think to prepare a compile-time option to extend ID bits is enough for now.</p>
<blockquote>
<p>We need at least (8+40)B per ID, so 536,870,912 IDs consumes 171,798,691,840 = 171GB.</p>
</blockquote>
<p>Is "(8+40)B" a typo of (8*40)B ?</p>
Ruby master - Feature #11420: Introduce ID key table into MRI
https://bugs.ruby-lang.org/issues/11420?journal_id=53685
2015-08-06T14:02:16Z
funny_falcon (Yura Sokolov)
funny.falcon@gmail.com
<ul></ul><p>Koichi Sasada , i've made another one "hash" for your experiments<br>
<a href="https://github.com/ko1/ruby/pull/1" class="external">https://github.com/ko1/ruby/pull/1</a></p>
Ruby master - Feature #11420: Introduce ID key table into MRI
https://bugs.ruby-lang.org/issues/11420?journal_id=53687
2015-08-06T18:18:31Z
ko1 (Koichi Sasada)
<ul></ul><p>On 2015/08/06 23:02, <a href="mailto:funny.falcon@gmail.com" class="email">funny.falcon@gmail.com</a> wrote:</p>
<blockquote>
<p>Koichi Sasada , i've made another one "hash" for your experiments</p>
</blockquote>
<p>Thank you!<br>
Which implementation do you like?</p>
<p>--<br>
// SASADA Koichi at atdot dot net</p>
Ruby master - Feature #11420: Introduce ID key table into MRI
https://bugs.ruby-lang.org/issues/11420?journal_id=53688
2015-08-06T18:54:37Z
funny_falcon (Yura Sokolov)
funny.falcon@gmail.com
<ul></ul><blockquote>
<p>Which implementation do you like?</p>
</blockquote>
<p>The one which will be faster.</p>
<p>Quadratic probing is simpler, so if it is not slower (or with in couple of percents)<br>
than coalesced chaining in usual application (big rails application :) ), then it is<br>
certainly preferred.</p>
<p>If per-class method cache were merged/implemented, then sorted array will be enough<br>
(probably).</p>
<p>--<br>
Sokolov Yura aka funny_falcon</p>
Ruby master - Feature #11420: Introduce ID key table into MRI
https://bugs.ruby-lang.org/issues/11420?journal_id=53750
2015-08-11T13:02:37Z
ko1 (Koichi Sasada)
<ul></ul><p>I added `mix' data structure.<br>
<a href="https://github.com/ko1/ruby/blob/f965b9bb3fc42cebb0dc30461a44bcf8fb550452/id_table.c" class="external">https://github.com/ko1/ruby/blob/f965b9bb3fc42cebb0dc30461a44bcf8fb550452/id_table.c</a></p>
<p>I measured another script.</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="vg">$classes</span> <span class="o">=</span> <span class="p">[]</span>
<span class="no">M</span> <span class="o">=</span> <span class="mi">40</span>
<span class="mi">10_000</span><span class="p">.</span><span class="nf">times</span><span class="p">{</span><span class="o">|</span><span class="n">i</span><span class="o">|</span>
<span class="n">defs</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span><span class="o">..</span><span class="no">M</span><span class="p">).</span><span class="nf">map</span><span class="p">{</span><span class="o">|</span><span class="n">e</span><span class="o">|</span>
<span class="s2">"def m</span><span class="si">#{</span><span class="n">e</span><span class="si">}</span><span class="s2">; end"</span>
<span class="p">}.</span><span class="nf">join</span><span class="p">(</span><span class="s2">"</span><span class="se">\n</span><span class="s2">"</span><span class="p">)</span>
<span class="nb">eval</span> <span class="o"><<-</span><span class="no">EOS</span><span class="sh">
class C</span><span class="si">#{</span><span class="n">i</span><span class="si">}</span><span class="sh">
$classes << self
</span><span class="si">#{</span><span class="n">defs</span><span class="si">}</span><span class="sh">
end
</span><span class="no"> EOS</span>
<span class="p">}</span>
<span class="nb">require</span> <span class="s1">'objspace'</span>
<span class="nb">p</span> <span class="vg">$classes</span><span class="p">.</span><span class="nf">reduce</span><span class="p">(</span><span class="mi">0</span><span class="p">){</span><span class="o">|</span><span class="n">r</span><span class="p">,</span> <span class="n">klass</span><span class="o">|</span> <span class="n">r</span> <span class="o">+</span> <span class="no">ObjectSpace</span><span class="p">.</span><span class="nf">memsize_of</span><span class="p">(</span><span class="n">klass</span><span class="p">)}</span>
</code></pre>
<pre><code> method# (M)
IMPL# 0 5 20 40 80
1:st 7,040,000 7,040,000 16,480,000 26,080,000 45,280,000
11:list 5,280,000 6,560,000 10,400,000 15,520,000 25,760,000
12:list 5,280,000 6,240,000 9,120,000 12,960,000 20,640,000
21:hash 5,360,000 6,640,000 10,480,000 15,600,000 25,840,000
22:hash 5,360,000 6,640,000 10,480,000 15,600,000 25,840,000
31:mix 5,360,000 6,320,000 9,200,000 25,840,000 25,840,000
(Bytes)
</code></pre>
<p>Basically, we don't need to care about size for practical cases (small amount of methods) :p<br>
I will commit them soon.</p>
Ruby master - Feature #11420: Introduce ID key table into MRI
https://bugs.ruby-lang.org/issues/11420?journal_id=53751
2015-08-11T13:11:42Z
ko1 (Koichi Sasada)
<ul><li><strong>File</strong> <a href="/attachments/5446">measurement2.png</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/5446/measurement2.png">measurement2.png</a> added</li></ul><p><img src="https://bugs.ruby-lang.org/attachments/download/5446/measurement2.png" alt="2nd measurement" loading="lazy"></p>
Ruby master - Feature #11420: Introduce ID key table into MRI
https://bugs.ruby-lang.org/issues/11420?journal_id=53757
2015-08-11T16:20:25Z
funny_falcon (Yura Sokolov)
funny.falcon@gmail.com
<ul></ul><p>Koichi Sasada, did you measure performance of mix approach compared to hash?<br>
Perhaps, threshold should be set to lower value than 32.</p>
Ruby master - Feature #11420: Introduce ID key table into MRI
https://bugs.ruby-lang.org/issues/11420?journal_id=53758
2015-08-11T20:16:45Z
ko1 (Koichi Sasada)
<ul><li><strong>File</strong> <a href="/attachments/5447">microbenchmark.pdf</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/5447/microbenchmark.pdf">microbenchmark.pdf</a> added</li></ul><blockquote>
<p>Koichi Sasada, did you measure performance of mix approach compared to hash?</p>
</blockquote>
<p>I made micro benchmark set.</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="nb">require</span> <span class="s1">'objspace'</span>
<span class="nb">require</span> <span class="s1">'benchmark'</span>
<span class="vg">$results</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">insert_perf</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">hit_perf1</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">hit_perf2</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">miss_perf1</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">miss_perf2</span> <span class="o">=</span> <span class="p">[]</span>
<span class="nb">srand</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
<span class="mi">0</span><span class="p">.</span><span class="nf">step</span><span class="p">(</span><span class="ss">to: </span><span class="mi">100</span><span class="p">,</span> <span class="ss">by: </span><span class="mi">10</span><span class="p">){</span><span class="o">|</span><span class="n">m</span><span class="o">|</span>
<span class="n">classes</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">defs</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span><span class="o">..</span><span class="n">m</span><span class="p">).</span><span class="nf">map</span><span class="p">{</span><span class="o">|</span><span class="n">e</span><span class="o">|</span>
<span class="s2">"def m</span><span class="si">#{</span><span class="n">e</span><span class="si">}</span><span class="s2">; end"</span>
<span class="p">}.</span><span class="nf">sort_by</span><span class="p">{</span><span class="nb">rand</span><span class="p">}.</span><span class="nf">join</span><span class="p">(</span><span class="s2">"</span><span class="se">\n</span><span class="s2">"</span><span class="p">)</span>
<span class="n">insert_perf</span> <span class="o"><<</span> <span class="no">Benchmark</span><span class="p">.</span><span class="nf">measure</span><span class="p">{</span>
<span class="mi">10_000</span><span class="p">.</span><span class="nf">times</span><span class="p">{</span><span class="o">|</span><span class="n">i</span><span class="o">|</span>
<span class="n">classes</span> <span class="o"><<</span> <span class="no">Class</span><span class="p">.</span><span class="nf">new</span><span class="p">{</span>
<span class="nb">eval</span><span class="p">(</span><span class="n">defs</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">method_missing</span> <span class="n">mid</span>
<span class="k">end</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}.</span><span class="nf">real</span>
<span class="n">objs</span> <span class="o">=</span> <span class="n">classes</span><span class="p">.</span><span class="nf">map</span><span class="p">{</span><span class="o">|</span><span class="n">klass</span><span class="o">|</span> <span class="n">klass</span><span class="p">.</span><span class="nf">new</span><span class="p">}</span>
<span class="n">ms</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span><span class="o">..</span><span class="n">m</span><span class="p">).</span><span class="nf">map</span><span class="p">{</span><span class="o">|</span><span class="n">i</span><span class="o">|</span> <span class="s2">"m</span><span class="si">#{</span><span class="n">i</span><span class="si">}</span><span class="s2">"</span><span class="p">.</span><span class="nf">to_sym</span><span class="p">}</span>
<span class="n">miss_ms</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span><span class="o">..</span><span class="n">m</span><span class="p">).</span><span class="nf">map</span><span class="p">{</span><span class="o">|</span><span class="n">i</span><span class="o">|</span> <span class="s2">"miss_m</span><span class="si">#{</span><span class="n">i</span><span class="si">}</span><span class="s2">"</span><span class="p">.</span><span class="nf">to_sym</span><span class="p">}</span>
<span class="n">hit_perf1</span> <span class="o"><<</span> <span class="no">Benchmark</span><span class="p">.</span><span class="nf">measure</span><span class="p">{</span>
<span class="mi">10</span><span class="p">.</span><span class="nf">times</span><span class="p">{</span>
<span class="n">objs</span><span class="p">.</span><span class="nf">each</span><span class="p">{</span><span class="o">|</span><span class="n">obj</span><span class="o">|</span>
<span class="n">n</span> <span class="o">=</span> <span class="nb">rand</span><span class="p">(</span><span class="n">m</span><span class="p">)</span>
<span class="n">obj</span><span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="n">ms</span><span class="p">[</span><span class="n">n</span><span class="p">])</span> <span class="k">if</span> <span class="n">m</span> <span class="o">></span> <span class="mi">0</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}.</span><span class="nf">real</span>
<span class="n">hit_perf2</span> <span class="o"><<</span> <span class="no">Benchmark</span><span class="p">.</span><span class="nf">measure</span><span class="p">{</span>
<span class="n">objs</span><span class="p">.</span><span class="nf">each</span><span class="p">{</span><span class="o">|</span><span class="n">obj</span><span class="o">|</span>
<span class="mi">10</span><span class="p">.</span><span class="nf">times</span><span class="p">{</span>
<span class="n">n</span> <span class="o">=</span> <span class="nb">rand</span><span class="p">(</span><span class="n">m</span><span class="p">)</span>
<span class="n">obj</span><span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="n">ms</span><span class="p">[</span><span class="n">n</span><span class="p">])</span> <span class="k">if</span> <span class="n">m</span> <span class="o">></span> <span class="mi">0</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}.</span><span class="nf">real</span>
<span class="n">miss_perf1</span> <span class="o"><<</span> <span class="no">Benchmark</span><span class="p">.</span><span class="nf">measure</span><span class="p">{</span>
<span class="mi">10</span><span class="p">.</span><span class="nf">times</span><span class="p">{</span>
<span class="n">objs</span><span class="p">.</span><span class="nf">each</span><span class="p">{</span><span class="o">|</span><span class="n">obj</span><span class="o">|</span>
<span class="n">n</span> <span class="o">=</span> <span class="nb">rand</span><span class="p">(</span><span class="n">m</span><span class="p">)</span>
<span class="n">obj</span><span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="n">miss_ms</span><span class="p">[</span><span class="n">n</span><span class="p">])</span> <span class="k">if</span> <span class="n">m</span> <span class="o">></span> <span class="mi">0</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}.</span><span class="nf">real</span>
<span class="n">miss_perf2</span> <span class="o"><<</span> <span class="no">Benchmark</span><span class="p">.</span><span class="nf">measure</span><span class="p">{</span>
<span class="n">objs</span><span class="p">.</span><span class="nf">each</span><span class="p">{</span><span class="o">|</span><span class="n">obj</span><span class="o">|</span>
<span class="mi">10</span><span class="p">.</span><span class="nf">times</span><span class="p">{</span>
<span class="n">n</span> <span class="o">=</span> <span class="nb">rand</span><span class="p">(</span><span class="n">m</span><span class="p">)</span>
<span class="n">obj</span><span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="n">miss_ms</span><span class="p">[</span><span class="n">n</span><span class="p">])</span> <span class="k">if</span> <span class="n">m</span> <span class="o">></span> <span class="mi">0</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}.</span><span class="nf">real</span>
<span class="vg">$results</span> <span class="o"><<</span> <span class="n">classes</span><span class="p">.</span><span class="nf">reduce</span><span class="p">(</span><span class="mi">0</span><span class="p">){</span><span class="o">|</span><span class="n">r</span><span class="p">,</span> <span class="n">klass</span><span class="o">|</span> <span class="n">r</span> <span class="o">+</span> <span class="no">ObjectSpace</span><span class="p">.</span><span class="nf">memsize_of</span><span class="p">(</span><span class="n">klass</span><span class="p">)}</span>
<span class="p">}</span>
<span class="nb">puts</span> <span class="s2">"insert</span><span class="se">\t</span><span class="si">#{</span><span class="n">insert_perf</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="s2">"</span><span class="se">\t</span><span class="s2">"</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span>
<span class="nb">puts</span> <span class="s2">"hit_perf1</span><span class="se">\t</span><span class="si">#{</span><span class="n">hit_perf1</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="s2">"</span><span class="se">\t</span><span class="s2">"</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span>
<span class="nb">puts</span> <span class="s2">"hit_perf2</span><span class="se">\t</span><span class="si">#{</span><span class="n">hit_perf2</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="s2">"</span><span class="se">\t</span><span class="s2">"</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span>
<span class="nb">puts</span> <span class="s2">"miss_perf1</span><span class="se">\t</span><span class="si">#{</span><span class="n">miss_perf1</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="s2">"</span><span class="se">\t</span><span class="s2">"</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span>
<span class="nb">puts</span> <span class="s2">"miss_perf2</span><span class="se">\t</span><span class="si">#{</span><span class="n">miss_perf2</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="s2">"</span><span class="se">\t</span><span class="s2">"</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span>
<span class="nb">puts</span> <span class="s2">"mem</span><span class="se">\t</span><span class="si">#{</span><span class="vg">$results</span><span class="p">.</span><span class="nf">join</span><span class="p">(</span><span class="s2">"</span><span class="se">\t</span><span class="s2">"</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span>
</code></pre>
<p>raw data is <a href="https://docs.google.com/spreadsheets/d/1NEbb6p663rc6-6xLZyM2cZQqm1kr96FyCQsYGe6KfS0/edit?usp=sharing" class="external">https://docs.google.com/spreadsheets/d/1NEbb6p663rc6-6xLZyM2cZQqm1kr96FyCQsYGe6KfS0/edit?usp=sharing</a></p>
<p>Charts are attached in one pdf file.</p>
<ul>
<li>st is slow.</li>
<li>surprisingly, other methods are not so different (at least under or equal 100 methods).
<ul>
<li>I expected that insertion is slow on list, but not so slow.</li>
<li>I expected that lookup is fast on small list, but not so different.</li>
</ul>
</li>
<li>maybe we can make faster implementation (sophisticate).</li>
</ul>
<p>Any verifications / reproducible benchmarks are welcome.<br>
Discourse benchmark?</p>
Ruby master - Feature #11420: Introduce ID key table into MRI
https://bugs.ruby-lang.org/issues/11420?journal_id=53759
2015-08-11T20:20:14Z
ko1 (Koichi Sasada)
<ul></ul><p>Ah, I have important notice.</p>
<p>Now, I'm checking method table with Ruby's method definitions and Ruby's method invocations. They are heavy operations, so that they can hide overhead of table operations (but st was slow :p).</p>
<p>So if we apply this table to other purpose, we can see other effect.</p>
Ruby master - Feature #11420: Introduce ID key table into MRI
https://bugs.ruby-lang.org/issues/11420?journal_id=53761
2015-08-12T08:44:24Z
ko1 (Koichi Sasada)
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Closed</i></li></ul><p>Applied in changeset r51541.</p>
<hr>
<ul>
<li>id_table.h: introduce ID key table.<br>
[Feature <a class="issue tracker-2 status-5 priority-4 priority-default closed" title="Feature: Introduce ID key table into MRI (Closed)" href="https://bugs.ruby-lang.org/issues/11420">#11420</a>]<br>
This table only manage ID->VALUE table to reduce overhead of st.<br>
Some functions prefixed rb_id_table_* are provided.</li>
<li>id_table.c: implement rb_id_table_*.<br>
There are several algorithms to implement it.<br>
Now, there are roughly 4 types:
<ul>
<li>st</li>
<li>array</li>
<li>hash (implemented by Yura Sokolov)</li>
<li>mix of array and hash<br>
The macro ID_TABLE_IMPL can choose implementation.<br>
You can see detailes about them at the head of id_table.c.<br>
At the default, I choose 34 (mix of list and hash).<br>
This is not final decision.<br>
Please report your suitable parameters or<br>
your data structure.</li>
<li>symbol.c: introduce rb_id_serial_t and rb_id_to_serial()<br>
to represent ID by serial number.</li>
<li>internal.h: use id_table for method tables.</li>
<li>class.c, gc.c, marshal.c, vm.c, vm_method.c: ditto.</li>
</ul>
</li>
</ul>