Project

General

Profile

Actions

Bug #20188

open

`Module#const_source_location` returns wrong information when real constant was defined but autoload is still ongoing

Added by byroot (Jean Boussier) about 1 month ago. Updated 24 days ago.

Status:
Open
Assignee:
-
Target version:
-
ruby -v:
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin23]
[ruby-core:116223]

Description

Ref: https://github.com/fxn/zeitwerk/issues/281

const_source_location keeps returning the location of the autoload call, even after the real constant was defined. It only starts returning the real constant location once the autoload was fully completed.

# /tmp/autoload.rb
File.write("/tmp/const.rb", <<~RUBY)
module Const
  LOCATION = Object.const_source_location(:Const)
end
RUBY

autoload :Const, "/tmp/const"

p Const::LOCATION

Expected Output:

["/tmp/const.rb", 1]

Actual Output:

["/tmp/autoload.rb", 8]

Potential patch: https://github.com/ruby/ruby/pull/9549

Updated by Eregon (Benoit Daloze) about 1 month ago

Maybe the idea was because of autoload thread-safety the value is not published to other threads until the autoload completes, and so maybe const_source_location was done that way too.
I don't see the need to delay that though as it should not matter for thread-safety, as long as if the assignment happens for the autoload constant it cannot be undone and so the location will be the same once the autoload is completed (even with an exception after the assignment).
It would be good to check the current behavior of const_source_location if an exception happens after the assignment but in the same file of the autoload.

Updated by fxn (Xavier Noria) about 1 month ago

I tried, the current behavior does not make a lot of sense to me:

File.write('/tmp/bar.rb', 'Bar = 1; raise')
autoload :Bar, '/tmp/bar'
Bar rescue nil

p Object.const_source_location(:Bar) # ["foo.rb", 2]
p Object.autoload?(:Bar) # "/tmp/bar"
p Bar # raises

In Ruby, an autoload is a trigger, you are not even required to define the constant in the receiver. The autoload was triggered, therefore, in my view it should be gone. I would expect:

p Object.const_source_location(:Bar) # ["/tmp/bar.rb", 1]
p Object.autoload?(:Bar) # nil
p Bar # 1

Regarding the first sentence, here's an autoload that does not define a constant. It is just a trigger:

module M
  Bar = 1
end

class Object
  include M
end

File.write('/tmp/bar.rb', '')
autoload :Bar, '/tmp/bar'
Bar

p Object.const_source_location(:Bar) # ["foo.rb", 2]
p Object.autoload?(:Bar) # nil
p Bar # 1

Could it be the case that the internal "housekeeping" just does not get a chance because the exception bubbles up?

Updated by fxn (Xavier Noria) about 1 month ago

(BTW, to me, the fact that an autoload is just a trigger and does not require the constant to be defined in its receiver is dubious too. But that would be for a different thread.)

Updated by Eregon (Benoit Daloze) about 1 month ago

p Object.const_source_location(:Bar) # ["/tmp/bar.rb", 1]

I think Module#const_source_location should not trigger the autoload, at least such a change seems out of scope of this issue.

Module#const_source_location should return the location of the autoload before the autoload happens and the location of setting the constant after the autoload.
The "during the autoload, after the constant was set" is this issue. It seems fine to me to set it then and not delay that until the autoload is done.

Updated by fxn (Xavier Noria) about 1 month ago

I think Module#const_source_location should not trigger the autoload, at least such a change seems out of scope of this issue.

Oh, was not proposing that.

That output was meant to be what I'd expect from the previous example (which triggers the autoload). The whole program would be:

File.write('/tmp/bar.rb', 'Bar = 1; raise')
autoload :Bar, '/tmp/bar'
Bar rescue nil

# What I believe makes sense, but it is not what happens today.
p Object.const_source_location(:Bar) # ["/tmp/bar.rb", 1]
p Object.autoload?(:Bar) # nil
p Bar # 1

Updated by fxn (Xavier Noria) about 1 month ago

The general principle at place here would be: Ruby does not undo side-effects of code evaluation. If you defined a global, mutated an object, defined constants, ..., any change visible to the caller that happens before the exception is raised remains.

In this case, the constant was obviously defined, the side-effect should stay for consistency with that generic rule.

Updated by Eregon (Benoit Daloze) about 1 month ago

fxn (Xavier Noria) wrote in #note-5:

That output was meant to be what I'd expect from the previous example (which triggers the autoload). The whole program would be:

That's how I understood it, but I somehow missed the Bar rescue nil when reading the code.

So the current behavior seems to be that if the autoload fails then the autoload remains and keeps failing (and it even re-evaluates the file each time).
The Bar = 1 in /tmp/bar.rb is never published because the autoload raised, so effectively after the autoload returned by failing it is the same as if that constant was never set.
And in that logic, it does make sense that const_source_location still points at the autoload line, because in terms of state Bar has never been "assigned publicly".
By "assigned publicly" I mean the value is no longer private to the autoload but published to all.

So with these semantics that an exception during autoload does not publish the constant, the current behavior for const_source_location seems better.
Changing it would make const_source_location and autoload? inconsistent, like:

File.write('/tmp/bar.rb', 'Bar = 1; raise')
autoload :Bar, '/tmp/bar'
Bar rescue nil

p Object.const_source_location(:Bar) # ["/tmp/bar.rb", 1] with this change, ["foo.rb", 2] before. ["foo.rb", 2] is better because there is no constant Bar set and the assignment during the failed autoload was never published.
p Object.autoload?(:Bar) # "/tmp/bar" this means the autoload is still there and will be used for the next reference to Bar (just below)
p Bar # raises

Regarding your expected output, yeah it makes sense if the constant was published (if present) even if the autoload raised an exception.
I'm not sure why autoload behaves that way on exception and does not publish (possibly this was discussed before and I forgot, it does seem better than the older behavior of special "undefined constants" on failed autoload).
It does feel like something that would be worth to discuss and try to change.

Updated by fxn (Xavier Noria) about 1 month ago

@Eregon (Benoit Daloze) I see what you mean, but you can't make that consistent, because by that logic

Bar = 1
p Bar

should not print 1, right? The constant was created, the lookup for Bar found it, therefore its source location is not the autoload. The same way its value is no longer unknown.

Updated by fxn (Xavier Noria) about 1 month ago

To me, this should be the same logic of Kernel#require. Whatever side-effects happened while loading the file remain. Like, partially defined classes, for example.

Module#autoload is just a deferred Kernel#require for me (even more taking into account there is no expectation about its side-effects). If you triggered it, done.

It has the additional feature of ensuring other constant references to the same constant wait while the autoload is happening. But if the autoload raised, the ones waiting would return the constant if defined, or NameError otherwise (generic logic for constants).

Updated by Eregon (Benoit Daloze) about 1 month ago

fxn (Xavier Noria) wrote in #note-8:

@Eregon (Benoit Daloze) I see what you mean, but you can't make that consistent, because by that logic

Bar = 1
p Bar

should not print 1, right? The constant was created, the lookup for Bar found it, therefore its source location is not the autoload. The same way its value is no longer unknown.

I assumed you mean that code would be in the autoloaded file (/tmp/bar.rb), and there could still be a raise after it.
Yes it's true that the thread autoloading sees Bar as "defined" while it's not published globally yet. That's necessary though.

But then is it better to be consistent for that one thread only during the autoload, or consistent after the autoload?
I'd say the second matters more, the state during the autoload for the thread doing the autoload is always a bit weird and special, and it needs to be (e.g. defined?(Bar) would be nil there).
After the autoload is done (exception or not), I believe const_source_location and autoload? should be consistent (either the autoload is still there or it's not, not a mix according to which method), and this change would break that (for the edge case of assigning the constant + exception after).

Updated by Eregon (Benoit Daloze) about 1 month ago

On a related note IIRC the recent-ish change was on failed autoload, the autoload and its constant would be completely removed: https://bugs.ruby-lang.org/issues/15790#note-12
But it seems to not be the case for this edge case, and also not if we remove the Bar = 1.
It seems the behavior already changed again since then.

Updated by fxn (Xavier Noria) about 1 month ago

@Eregon (Benoit Daloze) I feel we need to recenter the discussion to consider const_source_location under the current logic that undoes some things if the autoload raises.

To me, there is no doubt const_source_location in the code being autoloaded has to return the real location just defined. Because that is coherent with the rest of the local state:

  • The lookup finds the constant.
  • defined?(Bar) returns "constant".
  • IMPORTANT: autoload?(:Bar) returns nil.
  • constants includes the constant.
  • Etc.

So, the runtime is locally consistent, except for const_source_location. That one is off compared to the rest.

If an exception is raised, the state of the constant is kind of restored. By the same reasoning, you'd restore const_source_location if you will, like autoload? goes back to returning the string with the file name again (but you got nil before).

Updated by Eregon (Benoit Daloze) about 1 month ago

One problem though with that restore way is, I think, const_source_location is always the same for all threads.
What you are showing there is only valid for the thread autoloading that file, but is not the case for any other thread, which effectively considers the autoload is not done yet (and from their POV not even started yet).
So for another thread, it seems bad to have const_source_location switching from autoload location to temporary assignment location to autoload location (notably it creates many state transitions instead of just 2).
Hence I think the current logic is correct and consistent.
And I think if we want to change this then we have the change the general behavior of "constant was set + exception after, during an autoload".

Updated by Eregon (Benoit Daloze) about 1 month ago

Alternatively if const_source_location returns a different result based on the calling thread then I agree it seems fine in the autoloading thread to return the temporary assignment location just after that assignment.
And then after the autoload, it's consistent for both the was-autoloading thread and other thread: either the autoload location if the autoload raised, or the assignment location if the autoload succeeded.

Updated by byroot (Jean Boussier) about 1 month ago

Ok, so I see what you mean. Accessing Const from another thread while the autoload is still being resolve blocks until it's done, which makes sense for thread safety, and I suspect my patch breaks that, I need to double check and fix it.

However I don't agree it means accessing the const location should continue to return the autoload location for other threads. What is not thread safe would be to use the constant that is being defined, but it's location can be changed atomically, so I see no reason to delay it.

Updated by byroot (Jean Boussier) about 1 month ago

Alright, I guess I better see what you mean now, there is a spec for defined? during autoload, that ensure other threads see the constant as not defined until the autoload completed:

    it "defined? returns nil in autoload thread and 'constant' otherwise for defined?" do
      results = check_before_during_thread_after {
        defined?(ModuleSpecs::Autoload::DuringAutoload)
      }
      results.should == ['constant', nil, 'constant', 'constant']
    end

I wouldn't have specified like that, but given it's the current behavior, it would make sense for const_source_location to be consistent with that. I'll see if I can make it work like this in my PR.

Updated by fxn (Xavier Noria) about 1 month ago

Are you guys sure defined? depends on the thread? In this test running in 3.3 I don't seem to reproduce:

$up = Queue.new
$down = Queue.new

File.write('/tmp/bar.rb', <<~RUBY)
  Bar = 1

  $up << true
  $down.pop

  p defined?(Bar)
RUBY

autoload :Bar, '/tmp/bar.rb'

t = Thread.new { Bar }

$up.pop
p defined?(Bar)

$down << true
t.join

I get two "constant"s printed.

Updated by Eregon (Benoit Daloze) about 1 month ago

@fxn (Xavier Noria) The defined? should be before the assignment in the autoloading thread.

Updated by fxn (Xavier Noria) about 1 month ago

Ah, I see.

So that, to me, stll reinforces my main design principle, that I have not shared so far in the discussion.

The code responsible for loading is the caller, bar.rb should work the same no matter how the require was issued. And that is why I am in favor of updating const_source_location. And I also agree with you that with the current logic, it should be restored on failure.

Updated by byroot (Jean Boussier) about 1 month ago

I opened a PR with some extra specs for the behavior after the constant is defined, but before the autoload is completed: https://github.com/ruby/ruby/pull/9613

And I updated my patch to be consistent with that behavior: https://github.com/ruby/ruby/pull/9549

Updated by Eregon (Benoit Daloze) about 1 month ago

fxn (Xavier Noria) wrote in #note-19:

The code responsible for loading is the caller, bar.rb should work the same no matter how the require was issued.

I agree.

And that is why I am in favor of updating const_source_location. And I also agree with you that with the current logic, it should be restored on failure.

Updating seems confusing here because it might imply it affects other threads too.
But it should only affect thread doing the autoload.

From https://github.com/ruby/ruby/pull/9549#issuecomment-1903680909 these are the desired semantics which also avoids other threads seeing the value change back and forth, and it also means there is no need to "restore":

To clarify this changes the behavior of const_source_location in the thread doing the autoload (during the autoload) but not for other threads, correct?
i.e. the (real) constant's location is only visible to other threads after the autoload finished, i.e., at the same time the constant is published to these other threads.

Updated by fxn (Xavier Noria) about 1 month ago

@Eregon (Benoit Daloze) Yeah, I was not addressing thread semantics in that comment, same as autoload?.

BTW, I need to understand how all this constant metadata logic/API works with fibers, because we always talk about threads. Specially given that fiber schedulers remove the control from the user.

Updated by byroot (Jean Boussier) about 1 month ago

I need to understand how all this constant metadata logic/API works with fibers, because we always talk about threads

You can substitute thread for fiber in all that discussion. Each autoload has an associated mutex that is held by the fiber that is triggering the autoload until require completes.

Updated by fxn (Xavier Noria) 27 days ago

While in threads Ruby controls context switching and coordination for autoloading constants or loading files with Kernel#require is built-in, the problem I see with fibers is that Ruby has no control, the user has the control by design.

For example, consider this script:

File.write('/tmp/bar.rb', <<~RUBY)
  Fiber.yield
  Bar = 1
RUBY

autoload :Bar, '/tmp/bar.rb'

Fiber.new { Bar }.resume
Fiber.new { Bar }.resume

produces a deadlock:

deadlock; lock already owned by another fiber belonging to the same thread (ThreadError)

While yielding like that is artificial, the analogous situation in threads does not deadlock, and does not err either. Threads wait as needed until the one autoloading finishes.

In the abstract, this is also a potential issue I see with the fiber scheduler interface, because it does not establish a contract for concurrent autoloads or requires. If a fiber does a non-blocking operation while the file is being loaded, and that was part of an autoload or require, what can we assume about that situation?

Updated by byroot (Jean Boussier) 26 days ago

If a fiber does a non-blocking operation while the file is being loaded, and that was part of an autoload or require, what can we assume about that situation?

I don't think it's anything particular to autoload.

autoload simply add an implicit Mutex around the initial constant access, and as you noted, fibers being cooperative, a mistake can lead to a deadlock. But it's not specific to autoload, any other mutex would cause the same issue, and I can't think of anything autoload could or should do here.

Updated by fxn (Xavier Noria) 26 days ago

To me, this is related with autoload, because it is the autoload the one making the fibers deadlock. If the last line was

Fiber.new { p 1 }.resume

instead of a reference to the constant being autoloaded, there would be no deadlock, right? Same with require, if we try this without autoloads:

File.write('/tmp/bar.rb', <<~RUBY)
  Fiber.yield
RUBY

Fiber.new { require '/tmp/bar.rb' }.resume
Fiber.new { require '/tmp/bar.rb' }.resume

we get a deadlock.

So that is my point, what we say about threads does not translate to fibers. In fibers, you can deadlock.

Kernel#require is thread-safe, things that would typically trigger a context switch (like I/O or sleep) won't make you deadlock if the other threads are waiting for that require. And same for autoload. And makes sense, because Ruby owns the thread schedulur mod hints API.

fibers being cooperative

That is less the case with fiber schedulers, no?

Can Falcon trigger deadlocks?

Updated by byroot (Jean Boussier) 26 days ago

That is less the case with fiber schedulers, no?

It's still cooperative. You implicitly yield on IOs sure, but still won't be prempted after running for too long.

Can Falcon trigger deadlocks?

I don't see why not, but I may be missing something. I'm not that well versed in the fiber scheduler.

Updated by fxn (Xavier Noria) 26 days ago

So, we are arriving at the concern I expressed above in the first message of this "subthread".

Since fiber schedulers have no contract related to require or autoload, my understanding is that these methods are thread-safe and not fiber-safe. You could make a HTTP call at the top level of the file, and enter a deadlock.

To this day, I don't have certainty that Falcon can safely run Rails in develoment mode. Not saying it cannot, because my knowledge is limited, only that if that is the case, I don't know which argument explains it.

Updated by fxn (Xavier Noria) 26 days ago

I'll open a separate issue for this :). If require is fiber-safe I'd like to know and maybe improve its docs. If it is not, that may be important.

Updated by fxn (Xavier Noria) 26 days ago

I have tried several things using require and autoload with the test scheduler and cannot get a deadlock by now.

Actions

Also available in: Atom PDF

Like1
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0