Project

General

Profile

Actions

Bug #15790

closed

Strange interaction between autoload and $LOADED_FEATURES

Added by fxn (Xavier Noria) almost 5 years ago. Updated almost 2 years ago.

Status:
Closed
Assignee:
-
Target version:
-
ruby -v:
ruby 2.6.1p33 (2019-01-30 revision 66950) [x86_64-darwin18]
[ruby-core:92403]

Description

If an autoload fails and we remove its associated file from $LOADED_FEATURES, the autoload is back:

$ cat x.rb
Y = 1 # should be X, emulates a typo

$ cat test.rb
def au
  Object.autoload?(:X).inspect
end

x_rb = File.realpath("x.rb")
autoload :X, x_rb

puts "before failed autoload autoload path is #{au}"

X rescue nil

puts "after failed autoload autoload path is #{au}"

$LOADED_FEATURES.delete(x_rb)

puts "after $LOADED_FEATURES deletion autoload path is #{au}"

The output is

$ ruby -v test.rb
ruby 2.6.1p33 (2019-01-30 revision 66950) [x86_64-darwin18]
before failed autoload autoload path is "/Users/fxn/tmp/x.rb"
after failed autoload autoload path is nil
after $LOADED_FEATURES deletion autoload path is "/Users/fxn/tmp/x.rb"

See? Last line would be expected to print a nil autoload path.

Actions #1

Updated by hsbt (Hiroshi SHIBATA) about 4 years ago

  • Tags set to core, autoload

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

This is still an issue in the master branch. One way to fix this is to delete the autoload if the constant isn't present after the file is loaded. I submitted a pull request for that: https://github.com/ruby/ruby/pull/4715

Updated by Eregon (Benoit Daloze) over 2 years ago

Maybe we should remove the constant as well if we remove the autoload?
I think that would be natural and simpler semantically.
Currently there is some really strange semantics around that, which lead to a lot of complexity (e.g., the concept of "undefined constants":
https://github.com/ruby/spec/blob/e60485b648924cffa3bc48578000fe667a7b6a9e/core/module/autoload_spec.rb#L444-L475

On a related note, currently when requiring a file fails, the autoload is not removed:
https://github.com/ruby/spec/blob/e60485b648924cffa3bc48578000fe667a7b6a9e/core/module/autoload_spec.rb#L408-L442
I think it would be nice if those 2 cases are consistent (i.e., handle any exception from inside autoload the same way).

Updated by fxn (Xavier Noria) over 2 years ago

I deleted my previous comment, misunderstood the situation.

Yeah, in my view if you fire an autoload, the autoload should be removed regardless of the outcome. Additionally, if the constant has not been defined, it should be removed (as undefined) too.

Updated by fxn (Xavier Noria) over 2 years ago

Indeed, the concept of undefined constants for me is dubious. An autoload claims that the constant should be in the given file, but you don't know until the file is evaluated. So, it has been always surprising to me that the constants API acts as if the constant was actually defined for real.

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

I updated the pull request to remove the constant instead of just deleting the autoload. If it passes CI, I'm fine with that approach.

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

jeremyevans0 (Jeremy Evans) wrote in #note-6:

I updated the pull request to remove the constant instead of just deleting the autoload. If it passes CI, I'm fine with that approach.

Removing the constant breaks 4 specs:

ruby 3.1.0dev (2021-08-06T20:59:12Z autoload-delete-af.. 63f803ae45) [x86_64-openbsd6.9]
                                                                                             
1)
Module#autoload does not remove the constant from Module#constants if the loaded file does not define it, but leaves it as 'undefined' FAILED
Expected ModuleSpecs::Autoload to have constant 'O' but it does not
/home/jeremy/tmp/ruby/spec/ruby/core/module/autoload_spec.rb:455:in `block (2 levels) in <top (required)>'
/home/jeremy/tmp/ruby/spec/ruby/core/module/autoload_spec.rb:34:in `<top (required)>'
                                                                                             
2)
Module#autoload after autoloading searches for the constant like the original lookup in lexical scopes if declared in parent and defined in current FAILED
Expected ModuleSpecs::Autoload to have constant 'DeclaredInParentDefinedInCurrent'
but it does not
/home/jeremy/tmp/ruby/spec/ruby/core/module/autoload_spec.rb:557:in `<module:Autoload>'
/home/jeremy/tmp/ruby/spec/ruby/core/module/autoload_spec.rb:541:in `block (3 levels) in <top (required)>'
/home/jeremy/tmp/ruby/spec/ruby/core/module/autoload_spec.rb:34:in `<top (required)>'
                                                                                             
3)
Module#autoload after autoloading searches for the constant like the original lookup and fails when finding the undefined autoload constant in the current scope when declared in current and defined in parent FAILED
Expected NameError
but no exception was raised (:declared_in_current_defined_in_parent was returned)
/home/jeremy/tmp/ruby/spec/ruby/core/module/autoload_spec.rb:573:in `<class:LexicalScope>'
/home/jeremy/tmp/ruby/spec/ruby/core/module/autoload_spec.rb:571:in `<module:Autoload>'
/home/jeremy/tmp/ruby/spec/ruby/core/module/autoload_spec.rb:566:in `block (3 levels) in <top (required)>'
/home/jeremy/tmp/ruby/spec/ruby/core/module/autoload_spec.rb:34:in `<top (required)>'

4)
Module#const_set when overwriting an existing constant does not warn if the previous value was undefined FAILED
Expected #<Module:0x0000061748b95cc0> to have constant 'Foo' but it does not
/home/jeremy/tmp/ruby/spec/ruby/core/module/const_set_spec.rb:112:in `block (3 levels) in <top (required)>'
/home/jeremy/tmp/ruby/spec/ruby/core/module/const_set_spec.rb:4:in `<top (required)>'

I pushed a commit to update the specs. Do we consider these specs to be desired behavior or just implementation details? Personally, I strongly dislike having specs for implementation details.

Updated by Eregon (Benoit Daloze) over 2 years ago

Thanks, it's fine to modify these specs.

Their description should still match the tested behavior though,
for "does not remove the constant from Module#constants if the loaded file does not define it, but leaves it as 'undefined'"
since the new behavior is the constant is no longer undefined but actually removes the constant.
So we should duplicate that spec, adapt the description and use version guards.

Do we consider these specs to be desired behavior or just implementation details?

Doesn't seem the best place to discuss this but in short, it is important to have those tests to ensure Ruby implementations are compatible in this regard.
These specs with the version guards are extremely helpful for alternative Ruby implementations to pick that change and understand what changed.

Updated by fxn (Xavier Noria) over 2 years ago

Indeed, to me, this belongs to the contract of these APIs and should be documented.

The documentation of Module#autoload should say what happens if the file is not found, if the file is found but raises, if the file is found and is fine, but does not define the constant. It should also document that a constant for which by now you only have an autoload set is assumed to be a constant of the receiver, etc.

The whole constants API is public interface that needs clearly-defined contracts in the docs. In particular, as you can imagine, Zeitwerk uses these APIs extensively.

Updated by fxn (Xavier Noria) over 2 years ago

BTW, Zeitwerk has its own "Ruby compatibility suite" for language details the implementation relies on. Like, you can override autoloads, remove_const does not execute autoloads, etc.: https://github.com/fxn/zeitwerk/blob/main/test/lib/zeitwerk/test_ruby_compatibility.rb.

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

Eregon (Benoit Daloze) wrote in #note-8:

Their description should still match the tested behavior though,
for "does not remove the constant from Module#constants if the loaded file does not define it, but leaves it as 'undefined'"
since the new behavior is the constant is no longer undefined but actually removes the constant.
So we should duplicate that spec, adapt the description and use version guards.

OK, I pushed another commit for that.

Actions #12

Updated by jeremyevans (Jeremy Evans) over 2 years ago

  • Status changed from Open to Closed

Applied in changeset git|08759edea8fb75d46c3e75217e6613465426a0d2.


Remove autoload for constant if the autoload fails

Previously, if an autoload failed (the file was loaded, but the
constant was not defined by the autoloaded file). Ruby will try
to autoload again if you delete the autoloaded file from
$LOADED_FEATURES. With this change, the autoload and the
constant itself are removed as soon as it fails.

To handle cases where multiple threads are autoloading, when
deleting an autoload, handle the case where another thread
already deleted it.

Fixes [Bug #15790]

Updated by ioquatix (Samuel Williams) almost 2 years ago

The behaviour described here does not seem strange to me, at least, I'm not sure if it's problematic. @fxn (Xavier Noria) what was the actual issue at the root of the behaviour described here?

Even more challenging is how this PR is implemented, given that there is no locking around rb_const_remove it seems to introduce yet another potential race condition between threads.

Finally, it's also not clear that the proposed PR actually addresses all permutations of the problem.

There is another spec here:

    it "raises a LoadError in each thread if the file does not exist" do
      file = fixture(__FILE__, "autoload_does_not_exist.rb")
      start = false

      threads = Array.new(10) do
        Thread.new do
          Thread.pass until start
          begin
            ModuleSpecs::Autoload.autoload :FileDoesNotExist, file
            Thread.pass
            ModuleSpecs::Autoload::FileDoesNotExist
          rescue LoadError => e
            e
          ensure
            Thread.pass
          end
        end
      end

      start = true
      threads.each { |t|
        t.value.should be_an_instance_of(LoadError)
      }
    end

This spec requires all 10 threads to raise a LoadError. With the current implementation, this leaves the autoload state in place.

We can see the unusual behaviour manifest:

autoload :X, "non_existant.rb"

begin
  X
rescue LoadError => error
  pp error
  # #<LoadError: cannot load such file -- non_existant.rb>
end

pp autoload? :X
# "non_existant.rb"

autoload :Y, "./empty.rb"

begin
  Y
rescue NameError => error
  pp error
  # #<NameError: uninitialized constant Y>
end

pp autoload? :Y
# nil

We now have specs which are effectively at odds with each other, and only differ in very nuanced details which probably shouldn't matter.

I'm not sure whether we either (1) revert the behaviour/PR here or (2) handle all cases where the required feature fails to load.

I think (1) is easier and more consistent. Autoload is state which is separate from the constant. It's a way of loading the constant if it's not loaded. The implementation of (2) is far more complex and tricky to get right. Knowing when to remove the autoload state does not seem trivial to me. There are edge cases which we have to consider like if the thread is terminated during the require (but would have otherwise succeeded). I'm not even sure we can enumerate all these conditions... Should this prevent another thread from successfully requiring the file? It also seems like it's going to be easier to revert this PR than it is to change a long established spec.

Updated by ioquatix (Samuel Williams) almost 2 years ago

Let me add, that I think it's also reasonable sequence of events/state transition:

before failed autoload autoload path is "/Users/fxn/tmp/x.rb"
after failed autoload autoload path is "/Users/fxn/tmp/x.rb"
after $LOADED_FEATURES deletion autoload path is "/Users/fxn/tmp/x.rb"

Updated by ioquatix (Samuel Williams) almost 2 years ago

Another weird edge case we have now:

File.write("./empty.rb", "Y = 1")
autoload :X, "./empty.rb"

begin
  pp X
rescue NameError => error
  pp error
  # #<NameError: uninitialized constant Y>
end

pp autoload? :X
# nil

begin
  pp Y
rescue NameError => error
  pp error
  # #<NameError: uninitialized constant Y>
end

pp autoload? :Y
# nil

Generates:

#<NameError: uninitialized constant X>
nil
1
nil

Basically, autoloading can load completely different constants, even if it fails, and then is no longer present. I'm sure there are other odd things that can happen here - i.e. weird edge cases.

Updated by fxn (Xavier Noria) almost 2 years ago

The behaviour described here does not seem strange to me

What do you mean Samuel?

You don't find strange that we pass from "no autoload for X" to "there's an autoload for X", without having set an autoload for X in between? (No threads involved.)

Updated by Eregon (Benoit Daloze) almost 2 years ago

There is some discussion on https://github.com/ruby/ruby/pull/5910#issuecomment-1128149901 as we well.

I think it'd be great to unify more the behavior when an autoload fail, and specifically avoid those "undefined constants" completely.

So after a failed autoload (for any reason, e.g. the file does not exist, an exception was raised, the constant was not assigned) either:

  • don't change anything, leave the autoload in place. The program should still fail from that exception if well-behaved (but I could also easily see that exception swallowed, since it can happen on any constant access, maybe with some rescue around it). Also the behavior might be weird if e.g. the file loaded fine but did not define the expected constant. Then the file shouldn't be loaded per require semantics. So just NameError in that case since the constant was not defined after require was called.
  • remove the autoload so it's not retried again (it's not great if a given file is partially loaded multiple times, can be quite confusing). Because all threads should wait each other for loading an autoload constant there should be no race in doing that, i.e., if it's done at the same place where we assign temp values from the loaded file into the real constants ("assign all the constants" in the PR terms), i.e., just before releasing the autoload lock for that constant. Other threads should simply recheck the state once they get the enter the lock, whether the constant is now set or no longer exists.
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0