https://bugs.ruby-lang.org/https://bugs.ruby-lang.org/favicon.ico?17113305112013-03-22T15:29:11ZRuby Issue Tracking SystemBackport200 - Backport #8142: [patch] iseq: reduce array allocations for simple sequenceshttps://bugs.ruby-lang.org/issues/8142?journal_id=378022013-03-22T15:29:11Zko1 (Koichi Sasada)
<ul></ul><p>(2013/03/22 14:30), tmm1 (Aman Gupta) wrote:</p>
<blockquote>
<p>Allocate iseq->mark_ary on demand, only if needed.</p>
<p>In my application, this reduces long lived arrays on the heap significantly.</p>
</blockquote>
<p>Ah, I got it. Nice!! I'll introduce it.</p>
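<p>As a rough picture of what "allocate on demand" means here, the following is a toy Ruby model, not the actual C patch. The class <code>ToyIseq</code> and its method names are hypothetical and only mirror the idea that the array is created the first time an object has to be kept alive.</p>

```ruby
# Toy model only: the real change lives in C (iseq.c). ToyIseq and its
# methods are hypothetical names used for illustration.
class ToyIseq
  attr_reader :mark_ary

  def initialize
    @mark_ary = nil # nothing is allocated up front
  end

  # Create the array lazily, the first time an object must be kept alive.
  def add_mark_object(obj)
    (@mark_ary ||= []) << obj
    obj
  end
end

simple = ToyIseq.new                   # a sequence with no objects to mark
with_literal = ToyIseq.new
with_literal.add_mark_object("hello")  # first use triggers the allocation

puts simple.mark_ary.inspect
puts with_literal.mark_ary.inspect
```

<p>In this model a sequence that never registers an object never pays for an array, which is the saving the patch summary describes for simple sequences.</p>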
<p>BTW, in the future, I want to avoid any live object relations from iseq.<br>
For example, the program "str = 'hello'" (compiled iseq) has one<br>
relation to a String object. However, this String object can be replaced<br>
with non-VALUE memory object (not a VALUE, but a memory dump). If I can<br>
replace with this technique, iseq->mark_ary is no longer needed.</p>
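<p>To see the relation between a compiled sequence and its string literal, one option is to disassemble the example program. This is a sketch that assumes CRuby, since <code>RubyVM::InstructionSequence</code> is implementation-specific and the instruction names in the listing vary between versions.</p>

```ruby
# RubyVM::InstructionSequence is CRuby-specific, and the exact disassembly
# text differs between versions, so treat the output as illustrative.
iseq = RubyVM::InstructionSequence.compile("str = 'hello'")

# The listing should contain an instruction whose operand is the literal "hello".
puts iseq.disasm

first  = iseq.eval
second = iseq.eval
puts first == second       # compares the contents of the two results
puts first.equal?(second)  # reports whether both evaluations returned one object
```

<p>The operand shown next to the string instruction is the object the compiled sequence has to keep alive, which is the relation discussed above.</p>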
<p>--<br>
// SASADA Koichi at atdot dot net</p> Backport200 - Backport #8142: [patch] iseq: reduce array allocations for simple sequenceshttps://bugs.ruby-lang.org/issues/8142?journal_id=378062013-03-22T17:19:03Ztmm1 (Aman Karmani)ruby@tmm1.net
<ul></ul><blockquote>
<p>For example, the program "str = 'hello'" (compiled iseq) has one<br>
relation to a String object. However, this String object can be replaced<br>
with non-VALUE memory object (not a VALUE, but a memory dump).</p>
</blockquote>
<p>Do you have any existing patch for this technique? I would like to try, for example to convert putstring instruction.</p>
<p>In my application there are lots of long lived strings. Many of these strings come from string literals in code.</p>
<pre><code>>> GC.start
>> ObjectSpace.count_objects[:T_STRING]
=> 311117

>> ObjectSpace.each_object(String).count
=> 305230

>> ObjectSpace.each_object(String).select{ |s| s.frozen? }.size
=> 233336
</code></pre>
<p>Also I see a high level of frozen string duplication. Some of this is due to duplication of common iseq->location.label (like "initialize")</p>
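<p>One way to find which frozen strings are duplicated most often is to tally them by content. This is a minimal sketch: the helper name <code>frozen_string_report</code> and its <code>top:</code> parameter are made up for illustration, and it assumes a Ruby version that provides <code>Enumerable#tally</code>.</p>

```ruby
# Hypothetical helper: counts frozen strings on the heap and lists the
# most frequently duplicated contents.
def frozen_string_report(top: 5)
  GC.start
  frozen = ObjectSpace.each_object(String).select(&:frozen?)
  counts = frozen.tally # content => number of live copies
  {
    frozen: frozen.size,
    unique: counts.size,
    top: counts.max_by(top) { |_, n| n }
  }
end

report = frozen_string_report
puts "frozen=#{report[:frozen]} unique=#{report[:unique]}"
report[:top].each { |str, n| puts "#{n}x #{str.inspect[0, 60]}" }
```

<p>The numbers depend entirely on the process being inspected, so no output is shown here.</p>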
<pre><code>>> ObjectSpace.each_object(String).select{ |s| s.frozen? }.uniq.size
=> 118043
</code></pre> Backport200 - Backport #8142: [patch] iseq: reduce array allocations for simple sequenceshttps://bugs.ruby-lang.org/issues/8142?journal_id=378102013-03-22T17:59:15Zko1 (Koichi Sasada)
<ul></ul><p>(2013/03/22 17:19), tmm1 (Aman Gupta) wrote:</p>
<blockquote>
<p>Do you have any existing patch for this technique? I would like to try, for example to convert putstring instruction.</p>
<p>In my application there are lots of long lived strings. Many of these strings come from string literals in code.</p>
</blockquote>
<p>Yes. This is why I want to try it.<br>
And it is good to hear we have a killer user :)</p>
<p>We already have a rough design for it.<br>
This design includes Symbol GC :)<br>
The key idea is to make a static string (or binary) table with a <em>reference<br>
counter</em>.<br>
We (the Heroku Matz team) have discussed it.</p>
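<p>The thread does not describe the design itself, so the following is only a guess at the general shape of a reference-counted string table, written as a Ruby model. The class <code>RefCountedStringTable</code> and its <code>acquire</code>/<code>release</code> methods are hypothetical and do not correspond to any real Ruby API.</p>

```ruby
# Hypothetical sketch of a reference-counted string table; this is not the
# design mentioned in the thread, only an illustration of the general idea.
class RefCountedStringTable
  Entry = Struct.new(:bytes, :refcount)

  def initialize
    @entries = {}
  end

  # Register one more user of the given content and return the shared copy.
  def acquire(str)
    entry = (@entries[str] ||= Entry.new(str.dup.freeze, 0))
    entry.refcount += 1
    entry.bytes
  end

  # Drop one user; the entry is removed once nobody refers to it.
  def release(str)
    entry = @entries.fetch(str)
    entry.refcount -= 1
    @entries.delete(str) if entry.refcount.zero?
    nil
  end

  def size
    @entries.size
  end
end

table = RefCountedStringTable.new
a = table.acquire("initialize")
b = table.acquire("initialize")
puts a.equal?(b)   # both users share one stored copy
table.release("initialize")
table.release("initialize")
puts table.size    # the entry is dropped after the last release
```

<p>Sharing one entry per distinct content would also address duplicated labels such as "initialize", since every user of the same text points at the same stored copy.</p>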
<p>However, now I'm trying GC improvements (restricted gen gc).<br>
So its priority is not the highest.<br>
I want to finish before RubyKaigi2013.<br>
(Event Driven Development)</p>
<p>--<br>
// SASADA Koichi at atdot dot net</p> Backport200 - Backport #8142: [patch] iseq: reduce array allocations for simple sequenceshttps://bugs.ruby-lang.org/issues/8142?journal_id=379112013-03-25T14:29:11Ztmm1 (Aman Karmani)ruby@tmm1.net
<ul></ul><blockquote>
<p>However, now I'm trying GC improvements (restricted gen gc).</p>
</blockquote>
<p>This is great news. With an incremental mark, long lived objects are less of a problem.</p>
<p>What is your plan for restricted GC?</p> Backport200 - Backport #8142: [patch] iseq: reduce array allocations for simple sequenceshttps://bugs.ruby-lang.org/issues/8142?journal_id=379222013-03-26T00:53:14Zko1 (Koichi Sasada)
<ul></ul><p>(2013/03/25 14:29), tmm1 (Aman Gupta) wrote:</p>
<blockquote>
<p>What is your plan for restricted GC?</p>
</blockquote>
<p>I'll make another feature request.</p>
<p>On a preliminary evaluation, simple (ideal) micro benchmark was<br>
20-30% faster.</p>
<p>--<br>
// SASADA Koichi at atdot dot net</p> Backport200 - Backport #8142: [patch] iseq: reduce array allocations for simple sequenceshttps://bugs.ruby-lang.org/issues/8142?journal_id=379282013-03-26T10:22:49Ztmm1 (Aman Karmani)ruby@tmm1.net
<ul></ul><blockquote>
<p>I'll make another feature request.</p>
<p>On a preliminary evaluation, simple (ideal) micro benchmark was<br>
20-30% faster.</p>
</blockquote>
<p>Great. I tried some GC experiments recently (an additional bit per object to track the longlife generation), but without a write barrier it was very tricky to implement. I look forward to seeing your idea.</p>
<blockquote>
<p>However, this String object can be replaced<br>
with non-VALUE memory object (not a VALUE, but a memory dump).</p>
</blockquote>
<p>I am doing some experiments with this technique. For the putstring instruction, this is very simple: replace rb_str_resurrect() with rb_str_new2() using the memory-dumped value.</p>
<p>But in DSTR, putobject instruction is used instead of putstring (<a href="https://github.com/ruby/ruby/commit/49371b54" class="external">https://github.com/ruby/ruby/commit/49371b54</a>). Replacing this with rb_str_new every time will increase GC pressure, so it is not ideal.</p>
<p>One solution is to make the memory dump reuse struct RString, so it can emulate a string object (a special API to allocate strings outside the ruby heap, using ALLOC(struct RString)). These objects can also contain an extra reference count field.</p>
<p>But if the putobject instruction gives out references to these unmanaged objects, then a reference count alone is not enough to decide when to free them. An additional GC mark/sweep will be required after refcount==0, to make sure no one is still referencing the string. This is still possible (a similar technique is used for unlinked method entries?). I have some patches for this approach, but I am curious what you think. Is this a bad idea?</p> Backport200 - Backport #8142: [patch] iseq: reduce array allocations for simple sequenceshttps://bugs.ruby-lang.org/issues/8142?journal_id=386492013-04-17T20:20:25Ztmm1 (Aman Karmani)ruby@tmm1.net
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Closed</i></li><li><strong>% Done</strong> changed from <i>0</i> to <i>100</i></li></ul><p>This issue was solved with changeset r40336.<br>
Aman, thank you for reporting this issue.<br>
Your contribution to Ruby is greatly appreciated.<br>
May Ruby be with you.</p>
<hr>
<p>iseq: reduce array allocations for simple sequences</p>
<ul>
<li>
<p>compile.c (iseq_add_mark_object): Use new rb_iseq_add_mark_object().</p>
</li>
<li>
<p>insns.def (setinlinecache): Ditto.</p>
</li>
<li>
<p>iseq.c (rb_iseq_add_mark_object): New function to allocate<br>
iseq->mark_ary on demand. [Bug <a class="issue tracker-4 status-5 priority-4 priority-default closed" title="Backport: [patch] iseq: reduce array allocations for simple sequences (Closed)" href="https://bugs.ruby-lang.org/issues/8142">#8142</a>]</p>
</li>
<li>
<p>iseq.h (rb_iseq_add_mark_object): Ditto.</p>
</li>
<li>
<p>iseq.c (prepare_iseq_build): Avoid allocating mark_ary until needed.</p>
</li>
<li>
<p>iseq.c (rb_iseq_build_for_ruby2cext): Ditto.</p>
</li>
</ul> Backport200 - Backport #8142: [patch] iseq: reduce array allocations for simple sequenceshttps://bugs.ruby-lang.org/issues/8142?journal_id=392652013-05-12T10:50:30Ztmm1 (Aman Karmani)ruby@tmm1.net
<ul><li><strong>Tracker</strong> changed from <i>Bug</i> to <i>Backport</i></li><li><strong>Project</strong> changed from <i>Ruby master</i> to <i>Backport200</i></li><li><strong>Status</strong> changed from <i>Closed</i> to <i>Assigned</i></li><li><strong>Assignee</strong> changed from <i>ko1 (Koichi Sasada)</i> to <i>nagachika (Tomoyuki Chikanaga)</i></li></ul> Backport200 - Backport #8142: [patch] iseq: reduce array allocations for simple sequenceshttps://bugs.ruby-lang.org/issues/8142?journal_id=401782013-06-28T02:32:42Znagachika (Tomoyuki Chikanaga)nagachika00@gmail.com
<ul><li><strong>Status</strong> changed from <i>Assigned</i> to <i>Closed</i></li></ul><p>This issue was solved with changeset r41682.<br>
Aman, thank you for reporting this issue.<br>
Your contribution to Ruby is greatly appreciated.<br>
May Ruby be with you.</p>
<hr>
<p>merge revision(s) 40336: [Backport <a class="issue tracker-4 status-5 priority-4 priority-default closed" title="Backport: [patch] iseq: reduce array allocations for simple sequences (Closed)" href="https://bugs.ruby-lang.org/issues/8142">#8142</a>]</p>
<pre><code>* compile.c (iseq_add_mark_object): Use new rb_iseq_add_mark_object().
* insns.def (setinlinecache): Ditto.
* iseq.c (rb_iseq_add_mark_object): New function to allocate
iseq->mark_ary on demand. [Bug #8142]
* iseq.h (rb_iseq_add_mark_object): Ditto.
* iseq.c (prepare_iseq_build): Avoid allocating mark_ary until needed.
* iseq.c (rb_iseq_build_for_ruby2cext): Ditto.
</code></pre>