Project

General

Profile

Actions

Bug #21201

open

Performance regression when defining methods inside `refine` blocks

Added by alpaca-tc (Hiroyuki Ishii) 4 days ago. Updated 3 days ago.

Status:
Open
Assignee:
-
Target version:
-
ruby -v:
3.3.7, 3.4.2, 3.5.0
[ruby-core:121451]

Description

After the change introduced in PR #10037, a significant performance regression has been observed when defining methods within refine blocks.

The PR correctly fixes a bug where callcache invalidation was missing when methods are defined inside a refinement. To fix this, it now invokes rb_clear_all_refinement_method_cache() at the time of method definition within refine. However, this function performs a full object space traversal via rb_objspace_each_objects, making it extremely expensive.

In our team's Rails application, refinements are used heavily. During application code reload (triggered in development mode), rb_clear_all_refinement_method_cache() is called 1425 times, 1142 of which originate via rb_clear_method_cache(). As a result, the code reload time has increased from 5 seconds to 15 seconds after the patch. Since reloading occurs every time the code changes, this degrades development experience severely.

Reproduction script:

require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"

  gem "benchmark-ips"
end

mod = Module.new
klass = Class.new

Benchmark.ips do |x|
  x.report(RUBY_VERSION) do
    mod.send(:refine, klass) do
      def call_1 = nil
      def call_2 = nil
      def call_3 = nil
    end
  end

  x.save! "/tmp/performance_regression_refine.bench"
  x.compare!
end

Benchmark results:

Comparison:
               3.2.7:  1500316.7 i/s
               3.3.7:      158.0 i/s - 9496.46x  slower
               3.5.0:      124.6 i/s - 12041.43x  slower
               3.4.2:      119.5 i/s - 12553.50x  slower

Related Slack discussion: https://ruby-jp.slack.com/archives/CLWSHA76V/p1741932018811539
I totally understand the necessity and value of the past PR that fixed the callcache bugs. That said, I was wondering if there might be any ideas to make it faster, or to handle the invalidation more efficiently.

Updated by byroot (Jean Boussier) 4 days ago

I was wondering if there might be any ideas to make it faster, or to handle the invalidation more efficiently.

The solution is likely to keep the list (set) of all existing call caches that belong to a refinement, as to save the object space traversal.

But yes, this performance issue is why at work we disallowed usage of refinement using rubocop.

Updated by tenderlovemaking (Aaron Patterson) 3 days ago

byroot (Jean Boussier) wrote in #note-1:

I was wondering if there might be any ideas to make it faster, or to handle the invalidation more efficiently.

The solution is likely to keep the list (set) of all existing call caches that belong to a refinement, as to save the object space traversal.

But yes, this performance issue is why at work we disallowed usage of refinement using rubocop.

I guess if refined methods are rare enough, maybe it's not worthwhile to allocate "refinement inline cache"? In other words, refined methods are always an IC miss?

Updated by byroot (Jean Boussier) 3 days ago

refined methods are always an IC miss?

I like that idea.

Updated by byroot (Jean Boussier) 3 days ago

The downside however is that I think we'd need to lock the VM to lookup methods? So it doesn't help with making Ractor lock less.

Updated by jhawthorn (John Hawthorn) 3 days ago ยท Edited

tenderlovemaking (Aaron Patterson) wrote in #note-2:

I guess if refined methods are rare enough, maybe it's not worthwhile to allocate "refinement inline cache"? In other words, refined methods are always an IC miss?

I believe that was the case before Ruby 3.3 https://github.com/ruby/ruby/commit/cfd7729ce7a31c8b6ec5dd0e99c67b2932de4732

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0