Bug #10902
closedrequire("enumerator") scans LOAD_PATH 2x on every invocation
Description
On every invocation of require "enumerator"
(for example during boot when many gems require it), the VM will scan the load path twice: once for enumerator.rb and again for enumerator.so. Of course, no file is found because enumerator is now shipped within the VM by default.
$ ruby -e' p $LOADED_FEATURES[0] '
"enumerator.so"
$ ruby -e' p $LOAD_PATH.size '
8
$ strace -e trace=open ruby -e' 1.times{ require "enumerator" } ' 2>&1 | grep enumerator.rb | wc -l
8
$ strace -e trace=open ruby -e' 1.times{ require "enumerator" } ' 2>&1 | grep enumerator.so | wc -l
8
$ strace -e trace=open ruby -e' 10.times{ require "enumerator" } ' 2>&1 | grep enumerator.so | wc -l
80
$ strace -e trace=open ruby -e' 100.times{ require "enumerator" } ' 2>&1 | grep enumerator.so | wc -l
800
In enumerator.c, we call rb_provide("enumerator.so")
which adds it to $LOADED_FEATURES. This means require "enumerator.so"
can be optimized, but most libraries do not include the .so extension when requiring enumerator.
Updated by tmm1 (Aman Karmani) over 9 years ago
This gets much worse as $LOAD_PATH grows. For example in our app $LOAD_PATH.size is 351, causing 702 failed open() syscalls per require.
Updated by tmm1 (Aman Karmani) over 9 years ago
Having a hard time coming up with a clean patch here. The following works but it's pretty hacky.
diff --git a/load.c b/load.c
index fa225fa..68d15e7 100644
--- a/load.c
+++ b/load.c
@@ -952,6 +952,9 @@ rb_require_safe(VALUE fname, int safe)
} volatile saved;
char *volatile ftptr = 0;
+ if (strcmp(RSTRING_PTR(fname), "enumerator") == 0)
+ return Qfalse;
+
if (RUBY_DTRACE_REQUIRE_ENTRY_ENABLED()) {
RUBY_DTRACE_REQUIRE_ENTRY(StringValuePtr(fname),
rb_sourcefile(),
Updated by normalperson (Eric Wong) over 9 years ago
I tried to fix it int load.c, but failed rubyspec:
http://80x24.org/spew/m/1436044472-18990-1-git-send-email-e@80x24.org.txt
However, this one-liner seems to work:
--- a/enumerator.c
+++ b/enumerator.c
@@ -2060,6 +2060,7 @@ InitVM_Enumerator(void)
rb_define_method(rb_cYielder, "yield", yielder_yield, -2);
rb_define_method(rb_cYielder, "<<", yielder_yield_push, -2);
+ rb_provide("enumerator.rb");
rb_provide("enumerator.so"); /* for backward compatibility */
}
I'm not sure if we need something similar for complex and rational,
not sure if they were really used via require in the past...
http://80x24.org/spew/m/1436044803-31588-1-git-send-email-e@80x24.org.txt
Updated by normalperson (Eric Wong) over 9 years ago
Ping? https://bugs.ruby-lang.org/issues/10902
I'd like to commit my second patch soon since this scan bothers me.
Updated by nobu (Nobuyoshi Nakada) over 9 years ago
It will disallow a library named "enumerator.rb".
Updated by jeremyevans0 (Jeremy Evans) over 3 years ago
If we want to allow loading an enumerator.rb
library, then we cannot avoid the scan of the load path for *.rb files. However, we can avoid scanning the load path twice, once for enumerator.rb
and once for enumerator.so
, since we already know we loaded enumerator.so
. I've submitted a pull request to do that: https://github.com/ruby/ruby/pull/4687
Updated by nobu (Nobuyoshi Nakada) over 3 years ago
- Backport changed from 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN to 2.7: REQUIRED, 3.0: REQUIRED
Seems fine.
Updated by jeremyevans (Jeremy Evans) over 3 years ago
- Status changed from Open to Closed
Applied in changeset git|345db8f2aa373a31c619c8f85bd372f0a20829c1.
Avoid pointless attempts to open .so file if already required
When attempting to require a file without an extension that has
already been required or provided with an .so extension, only
look for files with an .rb extension. There is no point in
trying to find files with an .so extension, since we already
know one has been loaded.
Previously, attempting to require such a file scanned the load
path twice, once for .rb and once for .so. Now it only scans
once for .rb. The scan once for .rb cannot be avoided, since
the .rb file would take precedence and should be loaded if it
exists.
Fixes [Bug #10902]
Updated by Eregon (Benoit Daloze) over 3 years ago
IMHO it would be fine to simply not allow loading any foo.rb
if foo.so
is already loaded, a change of semantics which would gain a lot of simplicity.
Do we have actual use cases for require "foo"
to first load foo.so
and then later on another require "foo"
to load foo.rb
?
It seems unlikely to be intentional behavior.
I argue users should at least specify the extension in both cases if they expect that.
Updated by nagachika (Tomoyuki Chikanaga) over 3 years ago
- Backport changed from 2.7: REQUIRED, 3.0: REQUIRED to 2.7: REQUIRED, 3.0: DONE
ruby_3_0 6f4ab641bb1035c5811e42e43320112e4a502a0e merged revision(s) 345db8f2aa373a31c619c8f85bd372f0a20829c1.
Updated by usa (Usaku NAKAMURA) almost 3 years ago
- Backport changed from 2.7: REQUIRED, 3.0: DONE to 2.7: DONE, 3.0: DONE
ruby_2_7 f236548038048118e766232034511e4877a59b49 merged revision(s) 345db8f2aa373a31c619c8f85bd372f0a20829c1.