Misc #20232
Open
Document Kernel#require and Module#autoload concurrency guarantees
Description
I'd like to document the concurrency guarantees of Kernel#require and Module#autoload.
When multiple threads load the same file concurrently, Kernel#require succeeds (returns true) in just one of them; the rest wait and then return false. The same can be said when a constant that has an autoload is referenced concurrently: assuming no errors, only one thread performs the load, and the rest wait. No context switch in the middle of an autoload will result in a NameError in other threads waiting for that constant.
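To make the thread behavior concrete, here is a minimal sketch of what I mean. The file names, the constants SLOW_FEATURE_LOADED and LazyConst, and the sleep used to force the threads to overlap are all made up for illustration; nothing here comes from the Ruby docs.

```ruby
require 'tmpdir'
require 'fileutils'

dir = Dir.mktmpdir # scratch directory for the two hypothetical files

# A file that takes a while to load, so the threads overlap.
feature = File.join(dir, 'slow_feature.rb')
File.write(feature, <<~RUBY)
  sleep 0.2
  SLOW_FEATURE_LOADED = true
RUBY

# Four threads require the same file; collect each return value.
require_results = 4.times.map { Thread.new { require feature } }.map(&:value)
puts "require: #{require_results.count(true)} true, #{require_results.count(false)} false"

# Same idea for a constant that has an autoload.
lazy = File.join(dir, 'lazy_const.rb')
File.write(lazy, <<~RUBY)
  sleep 0.2
  LazyConst = 42
RUBY

Object.autoload(:LazyConst, lazy)

# Four threads reference the constant; none should see a NameError.
autoload_results = 4.times.map { Thread.new { LazyConst } }.map(&:value)
puts "autoload: #{autoload_results.inspect}"

FileUtils.remove_entry(dir)
```

If the guarantees are as described above, exactly one require returns true, the other three return false, and every thread reads the autoloaded constant without raising.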
Now, I'd like to have a discussion about those guarantees with fibers.
In the case of manually managed fibers, users can enter a deadlock by hand:
# This produces a deadlock.
File.write('/tmp/bar.rb', <<~RUBY)
  Fiber.yield
RUBY
Fiber.new { require '/tmp/bar.rb' }.resume
Fiber.new { require '/tmp/bar.rb' }.resume
If this is expected, I guess users should be told in the API docs that this is a possibility? From a user's perspective, nothing lets you anticipate a deadlock there if the docs don't warn you.
A similar deadlock can be triggered with an autoload instead of a require:
# This produces a deadlock.
File.write('/tmp/bar.rb', <<~RUBY)
  Fiber.yield
  Bar = 1
RUBY
autoload :Bar, '/tmp/bar.rb'
Fiber.new { Bar }.resume
Fiber.new { Bar }.resume
Fibers managed by fiber schedulers are a different matter. I have not been able to trigger a deadlock with the fiber schedulers I have tried, but from the user's point of view, doing something like I/O or sleeping at the top level of a file is not unlike the manual Fiber.yield above. The contract for fiber schedulers is mostly an interface, and it does not address this, at least not explicitly. Do fiber schedulers guarantee anything about this under the current contract?
I'd be glad to volunteer docs with the conclusions of this thread.
Updated by fxn (Xavier Noria) 11 months ago · Edited
Follow-up.
We've exchanged impressions with @ioquatix (Samuel Williams).
Regarding the first example, it would be nice if Ruby detected the situation and raised a controlled, user-facing error message. Currently, the user sees a deadlock, but there are no locks in their code. That deadlock could be argued to be a leak from the implementation, and it is not informative enough for the user.
Regarding concurrency and require/autoload, they currently have no documentation. Users need to know what happens when two fibers concurrently require the same file or autoload the same constant. To that end, we'll do a pass over the code and improve the docs in this regard.