Project

General

Profile

Actions

Bug #18452

closed

Dramatic performance regression in Zeitwerk with 3.1

Added by fxn (Xavier Noria) over 2 years ago. Updated over 2 years ago.

Status:
Rejected
Assignee:
-
Target version:
-
[ruby-core:106917]

Description

Hi!

In Ruby 3.1, Zeitwerk loads implicit namespaces way slower than in previous versions. In this benchmark:

require 'fileutils'
require 'tmpdir'
require 'zeitwerk'

Dir.mktmpdir do |dir|
  Dir.chdir(dir)

  for i in 'a'..'z'
    for j in 'a'..'z'
      FileUtils.mkdir_p("lib/#{i}/#{j}")
    end
  end

  loader = Zeitwerk::Loader.new
  loader.push_dir('lib')
  loader.setup

  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  loader.eager_load
  puts Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

you go from about 0.06s to +11s. That is like a 180x slow down .

I have seen that the single cause of the slow down is one line: Zeitwerk's Kernel#require wrapper pushing to $LOADED_FEATURES when it intercepts a directory source code. For context, Zeitwerk sets autoloads for directories that define namespaces, so that their corresponding modules get autovivified when first referenced by the user.

I have written a self-contained script that reproduces the root of the issue without Zeitwerk, in case that helps debugging:

module Kernel
  alias original_require require

  def require(path)
    if path.start_with?('/foo')
      $LOADED_FEATURES << path
      true
    else
      original_require(path)
    end
  end
end

foos = []
('a'..'z').each do |i|
  ('a'..'z').each do |j|
    foos << "/foo#{i}#{j}"
  end
end

start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
foos.each { |foo| require foo }
puts Process.clock_gettime(Process::CLOCK_MONOTONIC) - start

I have seen that this commit is responsible for the majority of the regression, But it is not the only cause, before that commit, the script is faster, but not as fast by an important factor too.


Related issues 1 (0 open1 closed)

Related to Ruby master - Bug #17885: require_relative and require should be compatible with each other when symlinked files are usedClosedActions

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

The bug fixed in 79a4484a072e9769b603e7b4fbdb15b1d7eccb15 is always going to result in some slow down, since Ruby now has to determine the realpath for all entries in $LOADED_FEATURES. The commit doesn't change when the loaded feature index is rebuilt, though it makes the rebuilding process slower as the realpath is recalculated for all entries in $LOADED_FEATURES.

$LOADED_FEATURES is supposed to have files that ruby has required, it's not supposed to contain directories. So any code that modifies $LOADED_FEATURES to add a directory is doing something wrong, in my opinion. Is there a reason Zeitwerk is adding directories to $LOADED_FEATURES, instead of using its own data structure? I assume if you are overriding require, you can check your own data structure instead of abusing $LOADED_FEATURES.

FWIW, I tried the reproducer without Zeitwerk, and it performed about the same in both 3.0 and 3.1:

$ ruby30 -v t/t33.rb
ruby 3.0.3p157 (2021-11-24 revision 3fb7d2cadc) [x86_64-openbsd]
0.00035451585426926613
$ ruby31 -v t/t33.rb
ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-openbsd]
0.00035237614065408707

I'm not sure why the code in your non-Zeitwerk example would perform differently, as the index rebuild only happens when get_loaded_features_index is called (which is called by Kernel#require in ruby code) is called, it's not called just by modifying $LOADED_FEATURES, and your non-Zeitwerk example doesn't call Kernel#require, it only modifies $LOADED_FEATURES.

If I modify the script slightly, to add a require 'set' to the code being timed, then I do see a difference, it is a little less than 3x slower:

$ ruby30 -v t/t33.rb
ruby 3.0.3p157 (2021-11-24 revision 3fb7d2cadc) [x86_64-openbsd]
0.005338589195162058
$ ruby31 -v t/t33.rb
ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-openbsd]
0.01448665326461196

Considering it has to determine the realpath on over 65536 entries in $LOADED_FEATURES, that seems reasonable. My only guess as to what is happening with Zeitwerk is Zeitwerk is requiring an loaded feature index rebuild every time it is modifying $LOADED_FEATURES, resulting in quadratic behavior.

Updated by fxn (Xavier Noria) over 2 years ago

Is there a reason Zeitwerk is adding directories to $LOADED_FEATURES, instead of using its own data structure?

It didn't do that since it first shipped. The lib intercepted the require and did its thing without invoking the original method, and without storing the argument in $LOADED_FEATURES. It hijacked the call and nobody noticed. This has worked fine since forever.

But last week I got a report about some namespaces not loading. The reason for that was that the project loaded Zeitwerk first, and Bootsnap later. Both decorate require. Bootsnap assumes a successful require puts its argument in $LOADED_FEATURES and caches the directory internally. And that prevented reload, because $LOADED_FEATURES did not have the directory to clean it up, the cache of Bootsnap became out of sync, and Bootsnap returned false for subsequent require calls right away, Zeitwerk's was not even called.

I thought about different options, and believed that Bootsnap made a reasonable assumption, so just a few days ago I published a new version that pushed the directory to $LOADED_FEATURES.

You are right about directories being unexpected in that collection, see my comment in the original issue.

You are also right about the self-contained script. I missed some scientific notation and misread some numbers. These are the figures:

BEFORE 79a4484a07

% ./ruby -v ~/tmp/foo.rb
ruby 3.1.0dev (2021-10-02T14:51:29Z v3_1_0_preview1~1463 3f7da458a7) [x86_64-darwin21]
/Users/fxn/prj/zeitwerk/lib/zeitwerk/kernel.rb:24: warning: method redefined; discarding old require
Zeitwerk: 0.28963999999905354
Self-contained: 0.000389000000723172

AFTER 79a4484a07

% ./ruby -v ~/tmp/foo.rb
ruby 3.1.0dev (2021-10-02T14:51:29Z v3_1_0_preview1~1462 79a4484a07) [x86_64-darwin21]
/Users/fxn/prj/zeitwerk/lib/zeitwerk/kernel.rb:24: warning: method redefined; discarding old require
Zeitwerk: 11.44435700000031
Self-contained: 9.200000022246968e-05

As you see, the reported impact happens in the script using Zeitwerk.

Every require call pushes to $LOADED_FEATURES, and if I comment out the line that pushes to $LOADED_FEATURES in Zeitwerk performance is not affected. Why is the lib's push more costly?

Updated by fxn (Xavier Noria) over 2 years ago

BTW, Zeitwerk does not call the original require either. But it is a fact that if you comment out the push call, the regression disappears.

Actions #4

Updated by mame (Yusuke Endoh) over 2 years ago

  • Related to Bug #17885: require_relative and require should be compatible with each other when symlinked files are used added

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

fxn (Xavier Noria) wrote in #note-2:

I thought about different options, and believed that Bootsnap made a reasonable assumption, so just a few days ago I published a new version that pushed the directory to $LOADED_FEATURES.

You are right about directories being unexpected in that collection, see my comment in the original issue.

I think your original assumption is more reasonable, and Bootsnap's assumption is not. I'm also not sure what code is expecting a Kernel#require with a directory to work, but such code also seems suspect, IMO. If you've only made this change to Zeitwerk in the last week, I recommend reverting it.

Every require call pushes to $LOADED_FEATURES, and if I comment out the line that pushes to $LOADED_FEATURES in Zeitwerk performance is not affected. Why is the lib's push more costly?

When you modify $LOADED_FEATURES directly, you trigger a rebuild of the loaded feature index the next time Kernel#require (or related C-API method) is called. This has been the case for many years, it's just that index rebuilds are more expensive now due to realpath calculations. In general, it's best to never modify $LOADED_FEATURES unless you must (such as to remove entries to allow for code reloading).

Updated by fxn (Xavier Noria) over 2 years ago

I think your original assumption is more reasonable, and Bootsnap's assumption is not.

OK, thanks for the hint.

I'm also not sure what code is expecting a Kernel#require with a directory to work, but such code also seems suspect, IMO. If you've only made this change to Zeitwerk in the last week, I recommend reverting it.

In projects managed by Zeitwerk, subdirectories represent namespaces. If the namespace has a file defining it, all is normal, but if there is no explicit file, the library is expected to create a dummy module object and assign it to the corresponding constant.

Now, there's a general expectation in Zeitwerk that says that things are loaded on demand, when referenced. In order to do so and be symmetric with everything else, Zeitwerk defines an autoload on the directory. That is, if you have

app/controllers/admin

and no admin.rb anywhere, it executes autoload :Admin, "full/path/to/app/controllers/admin". Then, the require wrapper intercepts that call, autovivifies, does housekeeping, and does not call the original method. The autoload is hijacked.

And this has worked flawlessly since it first shipped about two years ago. Pushing to $LOADED_FEATURES is something new that has been out only for a few days.

Thanks for the explanation on the index being rebuilt!

I think we can close Jeremy, I'll follow your advice and revert. Thanks for your help.

Actions #7

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

  • Status changed from Open to Rejected
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0