Project

General

Profile

Actions

Bug #10902

closed

require("enumerator") scans LOAD_PATH 2x on every invocation

Added by tmm1 (Aman Gupta) over 6 years ago. Updated 3 months ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 2.1.5
[ruby-core:68290]
Tags:

Description

On every invocation of require "enumerator" (for example during boot when many gems require it), the VM will scan the load path twice: once for enumerator.rb and again for enumerator.so. Of course, no file is found because enumerator is now shipped within the VM by default.

$ ruby -e' p $LOADED_FEATURES[0] '
"enumerator.so"

$ ruby -e' p $LOAD_PATH.size '
8

$ strace -e trace=open ruby -e' 1.times{ require "enumerator" } ' 2>&1 | grep enumerator.rb | wc -l
8

$ strace -e trace=open ruby -e' 1.times{ require "enumerator" } ' 2>&1 | grep enumerator.so | wc -l
8

$ strace -e trace=open ruby -e' 10.times{ require "enumerator" } ' 2>&1 | grep enumerator.so | wc -l
80

$ strace -e trace=open ruby -e' 100.times{ require "enumerator" } ' 2>&1 | grep enumerator.so | wc -l
800

In enumerator.c, we call rb_provide("enumerator.so") which adds it to $LOADED_FEATURES. This means require "enumerator.so" can be optimized, but most libraries do not include the .so extension when requiring enumerator.

Updated by tmm1 (Aman Gupta) over 6 years ago

This gets much worse as $LOAD_PATH grows. For example in our app $LOAD_PATH.size is 351, causing 702 failed open() syscalls per require.

Updated by tmm1 (Aman Gupta) over 6 years ago

Having a hard time coming up with a clean patch here. The following works but it's pretty hacky.

diff --git a/load.c b/load.c
index fa225fa..68d15e7 100644
--- a/load.c
+++ b/load.c
@@ -952,6 +952,9 @@ rb_require_safe(VALUE fname, int safe)
     } volatile saved;
     char *volatile ftptr = 0;

+    if (strcmp(RSTRING_PTR(fname), "enumerator") == 0)
+   return Qfalse;
+
     if (RUBY_DTRACE_REQUIRE_ENTRY_ENABLED()) {
    RUBY_DTRACE_REQUIRE_ENTRY(StringValuePtr(fname),
                  rb_sourcefile(),

Updated by normalperson (Eric Wong) over 6 years ago

I tried to fix it int load.c, but failed rubyspec:
http://80x24.org/spew/m/1436044472-18990-1-git-send-email-e@80x24.org.txt

However, this one-liner seems to work:

--- a/enumerator.c
+++ b/enumerator.c
@@ -2060,6 +2060,7 @@ InitVM_Enumerator(void)
     rb_define_method(rb_cYielder, "yield", yielder_yield, -2);
     rb_define_method(rb_cYielder, "<<", yielder_yield_push, -2);

+    rb_provide("enumerator.rb");
     rb_provide("enumerator.so");   /* for backward compatibility */
 }

I'm not sure if we need something similar for complex and rational,
not sure if they were really used via require in the past...

http://80x24.org/spew/m/1436044803-31588-1-git-send-email-e@80x24.org.txt

Updated by normalperson (Eric Wong) over 6 years ago

Ping? https://bugs.ruby-lang.org/issues/10902
I'd like to commit my second patch soon since this scan bothers me.

Actions #5

Updated by nobu (Nobuyoshi Nakada) over 6 years ago

It will disallow a library named "enumerator.rb".

Updated by jeremyevans0 (Jeremy Evans) 3 months ago

If we want to allow loading an enumerator.rb library, then we cannot avoid the scan of the load path for *.rb files. However, we can avoid scanning the load path twice, once for enumerator.rb and once for enumerator.so, since we already know we loaded enumerator.so. I've submitted a pull request to do that: https://github.com/ruby/ruby/pull/4687

Updated by nobu (Nobuyoshi Nakada) 3 months ago

  • Backport changed from 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN to 2.7: REQUIRED, 3.0: REQUIRED

Seems fine.

Actions #8

Updated by jeremyevans (Jeremy Evans) 3 months ago

  • Status changed from Open to Closed

Applied in changeset git|345db8f2aa373a31c619c8f85bd372f0a20829c1.


Avoid pointless attempts to open .so file if already required

When attempting to require a file without an extension that has
already been required or provided with an .so extension, only
look for files with an .rb extension. There is no point in
trying to find files with an .so extension, since we already
know one has been loaded.

Previously, attempting to require such a file scanned the load
path twice, once for .rb and once for .so. Now it only scans
once for .rb. The scan once for .rb cannot be avoided, since
the .rb file would take precedence and should be loaded if it
exists.

Fixes [Bug #10902]

Updated by Eregon (Benoit Daloze) 3 months ago

IMHO it would be fine to simply not allow loading any foo.rb if foo.so is already loaded, a change of semantics which would gain a lot of simplicity.
Do we have actual use cases for require "foo" to first load foo.so and then later on another require "foo" to load foo.rb?
It seems unlikely to be intentional behavior.
I argue users should at least specify the extension in both cases if they expect that.

Updated by nagachika (Tomoyuki Chikanaga) 3 months ago

  • Backport changed from 2.7: REQUIRED, 3.0: REQUIRED to 2.7: REQUIRED, 3.0: DONE

ruby_3_0 6f4ab641bb1035c5811e42e43320112e4a502a0e merged revision(s) 345db8f2aa373a31c619c8f85bd372f0a20829c1.

Actions

Also available in: Atom PDF