https://bugs.ruby-lang.org/https://bugs.ruby-lang.org/favicon.ico?17113305112020-02-07T09:10:15ZRuby Issue Tracking SystemRuby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=841942020-02-07T09:10:15Zko1 (Koichi Sasada)
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/84194/diff?detail_id=56348">diff</a>)</li></ul><p>To build the <code>CCs</code> data structure (CI/CC pair entries), CI is now also a VALUE. In most cases (all except calls that use keyword parameters) it can be packed into a VALUE like a Fixnum.</p>
<p>This fix is <a href="https://github.com/ruby/ruby/pull/2888/commits/db33bf07d5dc1ecfe861f464a9274f8783faa70e" class="external">https://github.com/ruby/ruby/pull/2888/commits/db33bf07d5dc1ecfe861f464a9274f8783faa70e</a></p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=841992020-02-07T17:34:51Ztenderlovemaking (Aaron Patterson)tenderlove@ruby-lang.org
<ul></ul><p>This is great!</p>
<p>I have one question, do pCMC's contain entries from only <em>that</em> class, or that class plus parents?</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=842002020-02-08T03:03:57Zko1 (Koichi Sasada)
<ul></ul><p>tenderlovemaking (Aaron Patterson) wrote in <a href="#note-2">#note-2</a>:</p>
<blockquote>
<p>I have one question, do pCMC's contain entries from only <em>that</em> class, or that class plus parents?</p>
</blockquote>
<p>plus parents.</p>
<p>Assume:</p>
<ul>
<li>there are 3 classes <code>C3 < C2 < C1</code> and</li>
<li>make an object <code>c3 = C3.new</code> then</li>
<li>calling <code>c3.c1_foo; c3.c2_foo; c3.c3_foo</code>, where each method is defined in C1, C2, and C3 respectively,</li>
<li>also calling <code>c3.c1_foo(1, 2)</code>, which is possible because <code>C1#c1_foo</code> accepts a rest argument,</li>
<li>then C3's pCMC table (mid -> CCs) contains:
<ul>
<li><code>:c1_foo -> C1#c1_foo, [[CI(argc:0), CC(C1#c1_foo)], [CI(argc:2), CC(C1#c1_foo)]]</code></li>
<li><code>:c2_foo -> C2#c2_foo, [[CI(argc:0), CC(C2#c2_foo)]]</code></li>
<li><code>:c3_foo -> C3#c3_foo, [[CI(argc:0), CC(C3#c3_foo)]]</code></li>
</ul>
</li>
</ul>
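<p>The table above can be modeled in plain Ruby. This is only a rough sketch under stated assumptions: <code>CI</code>, <code>CC</code>, <code>PCMC</code>, and <code>cached_call</code> are hypothetical names for illustration, and the real structures live in C inside the VM.</p>

```ruby
# Hypothetical model of a per-class method cache (pCMC); not the VM's real structures.
CI = Struct.new(:mid, :argc)      # call-info: method id plus argument count
CC = Struct.new(:klass, :method)  # call-cache: receiver class plus resolved method

class C1
  def c1_foo(*args); end
end

class C2 < C1
  def c2_foo; end
end

class C3 < C2
  def c3_foo; end
end

# One table per class: mid -> { me: resolved method, ccs: [[CI, CC], ...] }
PCMC = Hash.new { |h, klass| h[klass] = {} }

def cached_call(recv, mid, *args)
  table = PCMC[recv.class]
  # The first call for this mid walks the class hierarchy; later calls reuse the entry.
  entry = table[mid] ||= { me: recv.class.instance_method(mid), ccs: [] }
  ci = CI.new(mid, args.size)
  pair = entry[:ccs].find { |cached_ci, _cc| cached_ci == ci }
  unless pair
    pair = [ci, CC.new(recv.class, entry[:me])]
    entry[:ccs] << pair
  end
  pair[1].method.bind(recv).call(*args)
end

c3 = C3.new
cached_call(c3, :c1_foo)
cached_call(c3, :c2_foo)
cached_call(c3, :c3_foo)
cached_call(c3, :c1_foo, 1, 2)

PCMC[C3].each do |mid, entry|
  argcs = entry[:ccs].map { |ci, _cc| ci.argc }
  puts "#{mid} -> #{entry[:me].owner}##{entry[:me].name}, argc: #{argcs}"
end
```

<p>In this model, entries for methods inherited from C1 and C2 end up in C3's own table, and <code>:c1_foo</code> holds one CI/CC pair per distinct argument count.</p>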
<p>This example shows one issue. <code>C1#c1_foo</code> accepts any number of arguments, so CCs can increase infinitely in theory.</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=842162020-02-10T08:15:53ZHanmac (Hans Mackowiak)hanmac@gmx.de
<ul></ul><p>does this have any effect on, or is it affected by, method missing?</p>
<p>like should method missing be cached like that? (I think maybe not?)</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=842242020-02-11T18:09:51Zheadius (Charles Nutter)headius@headius.com
<ul></ul><p>This is very nearly how JRuby works today.</p>
<p>Each class has a local cache of method lookups. This cache and call sites in code aggregate a tuple of [method, class serial]. Defining a new method causes a cascade down-hierarchy to flip all serial numbers so any methods cached with those serials are no longer used.</p>
<p>When using JVM invokedynamic, we use a JVM safepoint instead of a serial number (so the JVM can efficiently deoptimize on invalidate rather than checking the serial), but the mechanism is largely the same.</p>
<p>The biggest "problem" with the approach in JRuby is the large cost of invalidating down a hierarchy when adding a method higher up. For example, early patches to Kernel or Object repeatedly cause hundreds of descendant classes to invalidate.</p>
<p>JRuby does not separate this serial number/safe point invalidation on a per-method basis, which might reduce some of the invalidation overhead at the cost of having more invalidators.</p>
<p>I have experimented with putting the caching and invalidation logic directly into the method cache tuple, and invalidating directly, but never gone forward with that. If I recall correctly, it added significant complexity without actually reducing the cost of most invalidations.</p>
<p>One mechanism that has helped invalidation is to not trigger the safe point for any class from which no methods have been cached. So if the class has never been used as a lookup target, or if it has recently been invalidated, subsequent invalidations have no cost.</p>
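<p>As a rough illustration of the serial-number scheme described above, here is a minimal Ruby sketch. <code>CachedClass</code> and all of its methods are hypothetical names for illustration, not JRuby's API, and the JVM safepoint variant is not modeled.</p>

```ruby
# Hypothetical sketch of class-serial invalidation; names are illustrative only.
class CachedClass
  attr_reader :name, :serial, :subclasses, :invalidations

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
    @methods = {}
    @serial = 0
    @handed_out = false   # has a lookup been cached against the current serial?
    @invalidations = 0
    @subclasses = []
    parent.subclasses << self if parent
  end

  def lookup(mid)
    @methods[mid] || (@parent && @parent.lookup(mid))
  end

  # Returns the [method, class serial] tuple that a call site would hold.
  def cache_entry(mid)
    method = lookup(mid)
    @handed_out = true if method
    [method, @serial]
  end

  def valid?(entry)
    entry[1] == @serial
  end

  def define(mid, body)
    @methods[mid] = body
    invalidate
  end

  # Cascades down the hierarchy, but skips the serial flip for classes
  # from which nothing has been cached since their last invalidation.
  def invalidate
    if @handed_out
      @serial += 1
      @handed_out = false
      @invalidations += 1
    end
    @subclasses.each { |sub| sub.invalidate }
  end
end

object = CachedClass.new("Object")
a = CachedClass.new("A", object)
b = CachedClass.new("B", a)

object.define(:to_s, -> { "obj" })
entry = b.cache_entry(:to_s)
before = b.valid?(entry)

object.define(:inspect, -> { "x" })  # unrelated method, yet B's serial flips
after = b.valid?(entry)

object.define(:hash, -> { 1 })       # B has not re-cached, so nothing to flip
puts "valid before: #{before}, after: #{after}, B invalidations: #{b.invalidations}"
```

<p>The sketch shows both points from the comment: defining an unrelated method high in the hierarchy still invalidates entries cached on descendants, and classes with nothing cached pay no invalidation cost.</p>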
<p>I'd love to chat more about your ideas and how we can cooperate to come up with a more efficient invalidation mechanism.</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=842262020-02-11T18:33:54Zheadius (Charles Nutter)headius@headius.com
<ul></ul><blockquote>
<p>does this have any effect on, or is it affected by, method missing?</p>
</blockquote>
<p>In JRuby, the lookup of method_missing is cached, and that goes through a normal call site mechanism. So if we can't find a method to cache and call, we will cache and call method_missing.</p>
<p>This invalidates correctly, since defining a new method will cause such a call site to re-cache.</p>
<p>Note that this approach is more complicated for respond_to? caching, since that has to go through a lot more hoops (respond_to_missing? etc). A simple respond_to? (i.e. one that is not determined dynamically via respond_to_missing?) gets cached just like a call site, which could also be applied to ko1's logic described here.</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=842282020-02-11T21:18:57ZEregon (Benoit Daloze)
<ul></ul><p>FWIW, in TruffleRuby it used to be an Assumption (basically a class serial, but with no cost in JIT'ed code) per module, and defining a method would then invalidate all subclasses. Basically like MRI before this change.</p>
<p>However that spent a lot of time invalidating all those subclasses while loading code.</p>
<p>The problem becomes even worse for constants with lexical lookups.<br>
That means each constant lookup needs to either check every namespace every time, or we need to invalidate not only subclasses but every module that was enclosing (=Module.nesting) a constant lookup for that class.<br>
That means we can have cycles, and so invalidation gets even more expensive by needing to track those cycles somehow.<br>
Keeping a weak list of subclasses is also some cost.</p>
<p>In current TruffleRuby, every module has an Assumption, and we simply collect the Assumption of every module we went through during the initial lookup.<br>
We then need to invalidate only the module on which the new method is defined.<br>
This also means if Object#m is defined, and some call site has an inline cache on C#m, it's not invalidated because lookup never went higher than C.<br>
The drawback is multiple checks per call site, but given those are Assumption checks they have no cost in JIT'ed code, only in interpreter.</p>
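<p>The per-module scheme described above could be sketched roughly as follows. This is a plain Ruby model under stated assumptions: <code>Assumption</code>, <code>Mod</code>, and <code>CallSite</code> are hypothetical stand-ins, not Truffle's classes, and the "no cost in JIT'ed code" property is not modeled.</p>

```ruby
# Hypothetical model: Assumption is a plain Ruby object, not Truffle's class.
class Assumption
  def initialize
    @valid = true
  end

  def valid?
    @valid
  end

  def invalidate!
    @valid = false
  end
end

class Mod
  attr_reader :name

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
    @methods = {}
    @assumption = Assumption.new
  end

  # Defining a method invalidates only this module's Assumption.
  def define(mid, body)
    @methods[mid] = body
    @assumption.invalidate!
    @assumption = Assumption.new
  end

  # Returns [method, Assumptions of every module visited during the lookup].
  def lookup(mid, visited = [])
    visited << @assumption
    return [@methods[mid], visited] if @methods.key?(mid)
    return [nil, visited] unless @parent
    @parent.lookup(mid, visited)
  end
end

class CallSite
  attr_reader :lookups

  def initialize(mid)
    @mid = mid
    @klass = nil
    @method = nil
    @assumptions = []
    @lookups = 0
  end

  def call(klass)
    unless @klass.equal?(klass) && @method && @assumptions.all?(&:valid?)
      @klass = klass
      @method, @assumptions = klass.lookup(@mid)
      @lookups += 1
    end
    @method.call  # sketch only: a missing method is not handled
  end
end

object = Mod.new("Object")
c = Mod.new("C", object)
c.define(:m, -> { "C#m" })

site = CallSite.new(:m)
first = site.call(c)
object.define(:m, -> { "Object#m" })  # the lookup never went above C
second = site.call(c)
lookups_after_object = site.lookups
c.define(:m, -> { "new C#m" })        # C's Assumption is in the collected list
third = site.call(c)
puts [first, second, third, site.lookups].inspect
```

<p>Defining <code>Object#m</code> leaves the call site cached because the initial lookup stopped at <code>C</code>, while redefining <code>C#m</code> forces one new lookup.</p>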
<p>I've been thinking of trying an Assumption per (module, method name) to avoid unrelated invalidation. The obvious drawback is footprint. However, I think that in that mode it is probably almost never necessary to invalidate anything. Calls are likely to happen only once the relevant method has been defined (otherwise the call would just raise from method_missing, or weirdly use a superclass behavior, which is likely a bug).</p>
<p>Storing the invalidation bit on the method entry in the module's method_table sounds like an interesting idea. The main drawback, I think, is that, as you showed, you need to make a copy of the entry when invalidating and insert it into the method_table, so that inline caches which would still call that method can cache the copy (rather than keep seeing the invalidated entry and never cache at all, like D1 above).</p>
<p>This seems to also mean marking a method entry C#m as invalid would invalidate all call sites inline caching C#m, even if those call sites just see C instances and the new method is D#m (D<C).</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=842322020-02-12T00:01:09Zheadius (Charles Nutter)headius@headius.com
<ul></ul><blockquote>
<p>In current TruffleRuby, every module has an Assumption, and we simply collect the Assumption of every module we went through during the initial lookup.</p>
</blockquote>
<p>The chaining of Assumption (which in my case would be an indy SwitchPoint) is an interesting approach. In theory once they all inline it should boil down to a single safepoint that can be triggered by multiple isolated invalidators. I may give that a try in JRuby. My main concern without having tried it would be creating excessively long chains of SwitchPoint and the impact to startup and warmup time. Until they inline, they'll be executed manually, deepening the stack at each level (as you mention, it impacts interpreter performance).</p>
<p>Assuming that everything is able to JIT, though, a chain of SwitchPoint should optimize in JRuby pretty much like your chain of Assumptions.</p>
<blockquote>
<p>The problem becomes even worse for constants with lexical lookups.</p>
</blockquote>
<p>We have never attempted to localize constant lookups for exactly this reason; you need to be able to invalidate not just modules but scopes, and that requires that every module knows from what scopes it can be seen. I didn't think the complexity was worth it for caching constants that should usually be immutable.</p>
<p>We did pay a global invalidation cost until we followed MRI's lead in setting up an invalidator per constant name; now the cost of invalidation is so minimal (because there's rarely many constants for a given name) I'm not sure it's worth trying to cache more locally based on lexical scopes.</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=842392020-02-12T04:26:37Zshyouhei (Shyouhei Urabe)shyouhei@ruby-lang.org
<ul></ul><p>I did an in-depth review of the pull request and left many small questions and suggestions on GitHub. However, there seem to be no fundamental flaws. I guess it works.</p>
<p>I'm not sure <em>how well</em> it works, though.</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=842402020-02-12T04:30:06Zshyouhei (Shyouhei Urabe)shyouhei@ruby-lang.org
<ul></ul><p>A small side note: when I tested before, the majority of GMC entries were accessed not because of sporadic IMC misses, but from inside <code>vm_call_opt_send</code> (where it is impossible to use IMC). This patch eliminates GMC, so the performance of <code>Object#send</code> can be affected positively or negatively. You might want to benchmark.</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=842412020-02-12T06:40:24Zko1 (Koichi Sasada)
<ul></ul><p>shyouhei (Shyouhei Urabe) wrote in <a href="#note-10">#note-10</a>:</p>
<blockquote>
<p>This patch eliminates GMC so performance of <code>Object#send</code> can be affected positively/negatively. You might want to benchmark.</p>
</blockquote>
<p>pCMC will support it.</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=842422020-02-12T06:43:01Zko1 (Koichi Sasada)
<ul></ul><p>Hanmac (Hans Mackowiak) wrote in <a href="#note-4">#note-4</a>:</p>
<blockquote>
<p>does this have any effect on, or is it affected by, method missing?</p>
<p>like should method missing be cached like that? (I think maybe not?)</p>
</blockquote>
<p>Good question.</p>
<ul>
<li>Current IMC caches <code>method_missing</code> for calls to undefined methods</li>
<li>Proposed IMC doesn't. But pCMC caches <code>method_missing</code> for the receiver's class => not so fast (because no IMC), but not so slow (pCMC will prevent class-tree traversal).</li>
</ul> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=842432020-02-12T09:53:41ZHanmac (Hans Mackowiak)hanmac@gmx.de
<ul></ul><p>My thought if the <code>respond_to_missing?</code> and <code>method_missing</code> thing depends on external Values like for a <code>OpenStruct</code> thing where respond_to_missing? returns true only if some other instance variable has the key</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=842442020-02-12T09:56:09Zko1 (Koichi Sasada)
<ul></ul><p>Hanmac (Hans Mackowiak) wrote in <a href="#note-13">#note-13</a>:</p>
<blockquote>
<p>My thought if the <code>respond_to_missing?</code> and <code>method_missing</code> thing depends on external Values like for a <code>OpenStruct</code> thing where respond_to_missing? returns true only if some other instance variable has the key</p>
</blockquote>
<p>Could you explain what the conclusion of your comment is?<br>
For method caching, it should be conservative, in general (because it is only a cache).</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=842592020-02-13T13:17:53Zalanwu (Alan Wu)
<ul></ul><p>If I understand this correctly, the proposed implementation can trigger an allocation on the GC heap for a CC when there is a cache miss.<br>
While the pCMC puts a bound on the number of allocations per polymorphic call site, the design seems like it would allocate a lot on megamorphic sites.<br>
I'm particularly worried about boot performance, when the caches are cold. It would also be nice to get numbers about impact on memory usage.</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=843082020-02-19T05:14:41Zko1 (Koichi Sasada)
<ul></ul><p>alanwu (Alan Wu) wrote in <a href="#note-15">#note-15</a>:</p>
<blockquote>
<p>If I understand this correctly, the proposed implementation can trigger an allocation on the GC heap for a CC when there is a cache miss.</p>
</blockquote>
<p>Correct.</p>
<blockquote>
<p>While the pCMC puts a bound on the number of allocations per polymorphic call site, the design seems like it would allocate a lot on megamorphic sites.</p>
</blockquote>
<p>Yes.</p>
<blockquote>
<p>I'm particularly worried about boot performance, when the caches are cold. It would also be nice to get numbers about impact on memory usage.</p>
</blockquote>
<p>On a simple Rails application (a scaffold-only app), I measured by accessing it 1000 times (<code>ab -n 1000</code>) in the production environment and got the following debug counter results.</p>
<p>Yes, boot performance can be affected.</p>
<pre><code>[RUBY_DEBUG_COUNTER] cc_new 87,759
[RUBY_DEBUG_COUNTER] iseq_cd_num 131,841
[RUBY_DEBUG_COUNTER] mc_inline_hit 6,699,074
[RUBY_DEBUG_COUNTER] mc_inline_miss_klass 424,326
[RUBY_DEBUG_COUNTER] mc_inline_miss_disabled 56
</code></pre>
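<p>For a quick reading of these counters, the ratios can be computed directly. This is only a back-of-the-envelope sketch: it assumes that the three <code>mc_inline_*</code> counters listed here are the only inline cache outcomes, which the output above does not confirm.</p>

```ruby
# Counter values copied from the debug output above.
counters = {
  cc_new: 87_759,
  iseq_cd_num: 131_841,
  mc_inline_hit: 6_699_074,
  mc_inline_miss_klass: 424_326,
  mc_inline_miss_disabled: 56,
}

# Assumption: hit + miss_klass + miss_disabled covers every inline cache access.
inline_total = counters[:mc_inline_hit] +
               counters[:mc_inline_miss_klass] +
               counters[:mc_inline_miss_disabled]

hit_rate  = counters[:mc_inline_hit].to_f / inline_total
cc_per_cd = counters[:cc_new].to_f / counters[:iseq_cd_num]

puts format("inline hit rate: %.1f%%", hit_rate * 100)   # prints 94.0%
puts format("CCs per call_data: %.2f", cc_per_cd)        # prints 0.67
```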
<p>Compared with the number of call_data entries (iseq_cd_num), the number of created CCs (cc_new) is not so high.</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=843252020-02-20T07:45:28Zko1 (Koichi Sasada)
<ul></ul><p>I'll merge this patch tomorrow.<br>
If you have any suggestion, please tell me.</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=843262020-02-20T08:33:28ZHanmac (Hans Mackowiak)hanmac@gmx.de
<ul></ul><p>ko1 (Koichi Sasada) wrote in <a href="#note-14">#note-14</a>:</p>
<blockquote>
<p>Hanmac (Hans Mackowiak) wrote in <a href="#note-13">#note-13</a>:</p>
<blockquote>
<p>My thought if the <code>respond_to_missing?</code> and <code>method_missing</code> thing depends on external Values like for a <code>OpenStruct</code> thing where respond_to_missing? returns true only if some other instance variable has the key</p>
</blockquote>
<p>Could you explain what the conclusion of your comment is?<br>
For method caching, it should be conservative, in general (because it is only a cache).</p>
</blockquote>
<p>my thoughts were: if <code>respond_to_missing?</code> returns true once, does this have any effect on this cache? even if it might later return false?</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=843272020-02-20T09:19:30Zko1 (Koichi Sasada)
<ul></ul><blockquote>
<p>my thoughts were: if respond_to_missing? returns true once, does this have any effect on this cache? even if it might later return false?</p>
</blockquote>
<p>Just now, no effect.</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=843442020-02-22T00:59:26Zko1 (Koichi Sasada)
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Closed</i></li></ul><p>Applied in changeset <a class="changeset" title="Introduce disposable call-cache. This patch contains several ideas: (1) Disposable inline metho..." href="https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/b9007b6c548f91e88fd3f2ffa23de740431fa969">git|b9007b6c548f91e88fd3f2ffa23de740431fa969</a>.</p>
<hr>
<p>Introduce disposable call-cache.</p>
<p>This patch contains several ideas:</p>
<ul>
<li>(1) Disposable inline method cache (IMC) for race-free inline method caching
<ul>
<li>Make the call-cache (CC) an RVALUE (GC target object) and allocate a new CC on a cache miss.</li>
<li>This technique allows race-free access from parallel processing elements, like RCU.</li>
</ul>
</li>
<li>(2) Introduce per-Class method cache (pCMC)
<ul>
<li>Instead of a fixed-size global method cache (GMC), pCMC allows a flexible cache size.</li>
<li>Caching CCs reduces CC allocation and allows sharing a CC's fast path between call sites with the same call-info (CI).</li>
</ul>
</li>
<li>(3) Invalidate an inline method cache by invalidating the corresponding method entries (MEs)
<ul>
<li>Instead of using class serials, we set an "invalidated" flag on the method entry itself to represent cache invalidation.</li>
<li>Compared with using class serials, the impact of method modification (add/overwrite/delete) is small.</li>
<li>Updating class serials invalidates all method caches of the class and its sub-classes.</li>
<li>The proposed approach invalidates the method cache of only one ME.</li>
</ul>
</li>
</ul>
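<p>Ideas (1) and (3) above can be sketched together in plain Ruby. This is a simplified model, not the actual C implementation: <code>MethodEntry</code>, <code>CallCache</code>, <code>Klass</code>, and <code>CallSite</code> are hypothetical names, and the step that replaces an overridden entry with a fresh copy follows the discussion earlier in this thread rather than the patch itself.</p>

```ruby
# Hypothetical model of ME-flag invalidation with disposable call caches.
MethodEntry = Struct.new(:owner, :mid, :body, :invalidated)
CallCache   = Struct.new(:klass, :me)

class Klass
  attr_reader :name, :table

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
    @table = {}
  end

  def lookup(mid)
    @table[mid] || (@parent && @parent.lookup(mid))
  end

  # Defining a method flags only the ME it replaces or overrides.
  def define(mid, &body)
    old = lookup(mid)
    if old
      old.invalidated = true
      unless old.owner.equal?(self)
        # The overridden entry gets a fresh copy so its owner can be cached again.
        old.owner.table[mid] = MethodEntry.new(old.owner, mid, old.body, false)
      end
    end
    @table[mid] = MethodEntry.new(self, mid, body, false)
  end
end

class CallSite
  attr_reader :cc_allocations

  def initialize(mid)
    @mid = mid
    @cc = nil
    @cc_allocations = 0
  end

  def call(klass)
    cc = @cc
    unless cc && cc.klass.equal?(klass) && !cc.me.invalidated
      # Disposable CC: never mutate the old one, allocate a new one on a miss.
      cc = CallCache.new(klass, klass.lookup(@mid))
      @cc_allocations += 1
      @cc = cc
    end
    cc.me.body.call
  end
end

c = Klass.new("C")
d = Klass.new("D", c)
c.define(:m) { "C#m" }
c.define(:other) { "C#other" }

m_site = CallSite.new(:m)
other_site = CallSite.new(:other)
m_site.call(d)
other_site.call(d)

d.define(:m) { "D#m" }            # flags only the old C#m entry
m_result = m_site.call(d)         # miss: a new CC is allocated
other_result = other_site.call(d) # still a hit: C#other was not touched
puts [m_result, m_site.cc_allocations, other_result, other_site.cc_allocations].inspect
```

<p>With class serials, both call sites would have been invalidated by <code>d.define(:m)</code>; in this model only the site that cached the affected entry allocates a new CC.</p>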
<p>See [Feature <a class="issue tracker-2 status-5 priority-4 priority-default closed" title="Feature: New method cache mechanism for Guild (Closed)" href="https://bugs.ruby-lang.org/issues/16614">#16614</a>] for more details.</p> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=844072020-02-27T09:49:04Zvo.x (Vit Ondruch)v.ondruch@tiscali.cz
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-5 priority-4 priority-default closed" href="/issues/16658">Bug #16658</a>: `method__cache__clear` DTrace hook was dropped without replacement</i> added</li></ul> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=877922020-09-29T03:37:28Zhsbt (Hiroshi SHIBATA)hsbt@ruby-lang.org
<ul><li><strong>Target version</strong> changed from <i>36</i> to <i>3.0</i></li></ul> Ruby master - Feature #16614: New method cache mechanism for Guildhttps://bugs.ruby-lang.org/issues/16614?journal_id=889832020-12-07T23:13:07ZEregon (Benoit Daloze)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-5 priority-4 priority-default closed" href="/issues/17373">Bug #17373</a>: Ruby 3.0 is slower at Discourse bench than Ruby 2.7</i> added</li></ul>