Project

General

Profile

Actions

Bug #17806

closed

Bad interaction between method cache, prepend, and refinements

Added by alanwu (Alan Wu) 8 months ago. Updated 4 months ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 3.0.1p64 (2021-04-05 revision 0fb782ee38) [x86_64-darwin19]
[ruby-core:103469]

Description

I'm running into a couple of issues with Ruby 3's new method cache and
refinements.

The first script raises SystemStackError unexpectedly:

module R1
  refine Hash do
    def foo; :r1; end
  end
end
class Hash
  prepend(Module.new)
end
class Hash
  def foo; end
end
{}.method(:foo) # put it on pCMC
module R2
  refine Hash do
    def foo; :r2; end
  end
end
{}.foo # SystemStackError

The second script calls the wrong method:

klass = Class.new { def foo; end }
_refinement = Module.new do
  refine(klass) { def foo; :refined; end }
end
klass.prepend(Module.new)
klass.new.foo # cache foo
klass.define_method(:foo) { :second }
p klass.new.foo # prints nil. False caching.

I submitted a GitHub PR to fix the issue: https://github.com/ruby/ruby/pull/4386

Actions #1

Updated by alanwu (Alan Wu) 7 months ago

  • Status changed from Open to Closed

Applied in changeset git|39a2ba5cc559900c30c3143da32446c2f20a7484.


Method cache: fix refinement entry handling

To invalidate some callable method entries, we replace the entry in the
class. Most types of method entries are on the method table of the
origin class, but refinement entries without an orig_me are housed in
the method table of the class itself. They are there because refinements
take priority over prepended methods.

By unconditionally inserting a copy of the refinement entry into the
origin class, clearing the method cache created situations where there
are refinement entry duplicates in the lookup chain, leading to infinite
loops and other problems.

Update the replacement logic to use the right class that houses the
method entry. Also, be more selective about cache invalidation when
moving refinement entries for prepend. This avoids calling
clear_method_cache_by_id_in_class() before refinement entries are in the
place it expects.

[Bug #17806]

Updated by nagachika (Tomoyuki Chikanaga) 6 months ago

  • Backport changed from 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: REQUIRED to 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: DONE

ruby_3_0 d47df50678b00bd622e6be474031204ed2e52b31 merged revision(s) 39a2ba5cc559900c30c3143da32446c2f20a7484.

Updated by mk (Matthias Käppler) 4 months ago

Hi alanwu (Alan Wu),

at GitLab we are currently working on making our application Ruby 3 ready.

We are running into a strange issue in our test suite where refinements fails to be applied when everything suggests they should be.

During my research I stumbled on this issue here and I was wondering if you could give me your thoughts on whether this could be a Ruby VM bug.

I summarized in this issue what the problem is: https://gitlab.com/gitlab-org/gitlab/-/issues/337614

To recap the high-level highlights and observations:

  • We use rspec-parameterized and its table-syntax extension (which overrides the pipe | operator using refinements)
  • Whenever we run a test that composes a test table using integer arguments, Ruby will invoke the built-in bitwise-OR instead of the refinement, thus breaking the test
  • When I add another refinement to Integer at the top of the test file, even when not using it, the test will start to pass

This leads me to believe it could be due to Ruby's method table being messed up somehow, since it is very suspicious that by refining a class with an unused method, another refinement should suddenly start to apply. Since this changeset here touches the caching behavior, I thought you might have an idea whether it could be related.

We also verified that this happens not just with 3.0.2, but also 3.0.1, but works on 2.7, so it is likely not strictly related to this particular change. But since I have no evidence that it is in fact a Ruby bug, I was also hesitant to open a new issue.

I have so far not been able to produce a minimal executable test case either.

Any hints and thoughts for how to debug this would be appreciated!

Updated by alanwu (Alan Wu) 4 months ago

Hi mk (Matthias Käppler), from the description you gave, it's possible that you are running into #17725.
I would try running these two lines before requiring the test file that contains the | call:

iseq = RubyVM::InstructionSequence
iseq.compile_option = iseq.compile_option.merge(specialized_instruction: false)

If that fixes the issue I would try to run the test unmodified with a build of the master branch.

If the issue still exists on the master branch, I would try creating a reduced reproducer by logging ::Integer.ancestors before each | call and refinement creation, assuming it's an issue related to lookup caching.

Updated by mk (Matthias Käppler) 4 months ago

Thanks alanwu (Alan Wu)!

I forgot to watch this issue so I was never notified of your response. Sorry for the delay.

I would try running these two lines before requiring the test file that contains the | call:

iseq = RubyVM::InstructionSequence
iseq.compile_option = iseq.compile_option.merge(specialized_instruction: false)

Thank you so much -- I was eventually able to verify that this is indeed the problem. Moving this in front of application.rb (this is a Rails app) fixes the test.

If that fixes the issue I would try to run the test unmodified with a build of the master branch.

Yep, we have a system in place that applies custom patches to our CI and production Rubies, and I was able to verify that applying https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/fb4cf204a662a8cd9dafef6f31f2bd0db9129abe/diff and https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/fa0279d947c3962c3f8c32852278d3ebb964cb19/diff to 3.0.2 resolves the issue (the former turns the test green, the latter fixes a segfault I ran into after applying the former).

Again, I really appreciate your help -- this caused me quite the headache :-)

Actions

Also available in: Atom PDF