Misc #20232
Open
Document Kernel#require and Module#autoload concurrency guarantees
Description
I'd like to document the concurrency guarantees of Kernel#require and Module#autoload.
In the case of multiple threads loading the same file concurrently, Kernel#require will succeed in just one of them, and the rest will wait and return false. The same holds when a constant that has an autoload is referenced concurrently: assuming no errors, only one thread will perform the load, and the rest will wait. There will be no context switch in the middle of an autoload that results in a NameError in other threads waiting for that constant.
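To make the thread behavior described above concrete, here is a minimal sketch, assuming CRuby. The file names, the `SlowConst` constant, and the `sleep` calls are hypothetical and exist only to keep each load in progress long enough for the other threads to reach it.

```ruby
require 'tmpdir'

dir = Dir.mktmpdir

# Hypothetical feature file; the sleep keeps the load in progress long enough
# for the other threads to reach the same require.
feature_path = File.join(dir, 'slow_feature.rb')
File.write(feature_path, <<~RUBY)
  sleep 0.2
  SLOW_FEATURE_LOADED = true
RUBY

# Four threads require the same file; each thread's value is require's return value.
require_results = Array.new(4) { Thread.new { require feature_path } }.map(&:value)

# Hypothetical autoloaded constant, defined by a file that also sleeps mid-load.
const_path = File.join(dir, 'slow_const.rb')
File.write(const_path, <<~RUBY)
  sleep 0.2
  SlowConst = 1
RUBY
Object.autoload(:SlowConst, const_path)

# Four threads reference the constant while its autoload may still be running.
const_values = Array.new(4) { Thread.new { SlowConst } }.map(&:value)

puts "require results: #{require_results.inspect}"
puts "constant values: #{const_values.inspect}"
```

If the guarantees hold as described, exactly one element of `require_results` is true and the rest are false, and every thread sees the fully defined constant rather than raising a NameError.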
Now, I'd like to have a discussion about those guarantees with fibers.
In the case of manually managed fibers, users can enter a deadlock by hand:
# This produces a deadlock.
File.write('/tmp/bar.rb', <<~RUBY)
Fiber.yield
RUBY
Fiber.new { require '/tmp/bar.rb' }.resume
Fiber.new { require '/tmp/bar.rb' }.resume
If this is expected, I guess users should be told in the API docs that this is a possibility? From a user's perspective, you don't really have the elements to anticipate a deadlock there if the docs don't warn you.
A similar deadlock can be triggered with an autoload instead of a require:
# This produces a deadlock.
File.write('/tmp/bar.rb', <<~RUBY)
Fiber.yield
Bar = 1
RUBY
autoload :Bar, '/tmp/bar.rb'
Fiber.new { Bar }.resume
Fiber.new { Bar }.resume
A different matter is fibers managed by fiber schedulers. I have not been able to enter a deadlock with the fiber schedulers I have tried, but from the point of view of the user, doing something like I/O or sleeping at the top level is not unlike the manual Fiber.yield above. The contract for fiber schedulers is mostly an interface, and it does not address this, at least not explicitly. Do fiber schedulers guarantee anything about this under the current contract?
I'd be glad to volunteer to write docs based on the conclusions of this thread.