Thread#join can break when fiber scheduler unblock fails or blocks.
In addition to https://bugs.ruby-lang.org/issues/17666 we found several more cases that need to be addressed.
Fix potential hang when joining threads.
If the thread termination invokes user code after the thread status becomes
THREAD_KILLED, and the user unblock function causes that status to
become something else (e.g. THREAD_RUNNING), threads waiting in
thread_join_sleep will hang forever. We move the unblock function call
to before the thread status is updated, and allow threads to join as soon
as th->value becomes defined.
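To make the ordering problem concrete, here is a small Ruby toy model. It is only a sketch: ToyThread, terminate_old, terminate_new, and the two join checks are hypothetical names invented for illustration and do not mirror the C code in CRuby. It assumes the waiter previously polled the status and now polls whether the value is defined.

```ruby
# Toy model only. All names here are hypothetical and are not CRuby internals.
class ToyThread
  attr_reader :status, :value

  def initialize(&unblock)
    @status = :running
    @value = nil
    @value_defined = false
    @unblock = unblock # stands in for the user unblock function
  end

  def value_defined?
    @value_defined
  end

  # User code may (incorrectly) flip the status back to running.
  def resume!
    @status = :running
  end

  # Old ordering: status becomes :killed, then user code runs.
  def terminate_old(result)
    @value = result
    @value_defined = true
    @status = :killed
    @unblock.call(self)
  end

  # New ordering: user code runs first, status is updated last.
  def terminate_new(result)
    @value = result
    @value_defined = true
    @unblock.call(self)
    @status = :killed
  end
end

# Old join condition: wait until the status is :killed.
def old_join_done?(thread)
  thread.status == :killed
end

# New join condition: wait until the value is defined.
def new_join_done?(thread)
  thread.value_defined?
end

misbehaving = ->(t) { t.resume! }

a = ToyThread.new(&misbehaving)
a.terminate_old(42)
puts old_join_done?(a) # false: a waiter polling the status would never finish
puts new_join_done?(a) # true: the value is already defined

b = ToyThread.new(&misbehaving)
b.terminate_new(42)
puts b.status          # killed: user code can no longer undo the final status
```

In this model either change alone is enough to let the waiter finish; the commit message describes applying both.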
Wake up join list within thread EC context. (#4471)
If rb_fiber_scheduler_unblock raises an exception, it can result in a
segfault if rb_threadptr_join_list_wakeup is not within a valid EC. This
change moves rb_threadptr_join_list_wakeup into the thread's top level EC,
which initially caused an infinite loop because the wakeup is retried on
exception. We explicitly remove items from the thread's join list to avoid
this situation.
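The retry problem can be sketched in Ruby with a toy join list. This is an illustration under stated assumptions, not the CRuby implementation: wakeup_without_removal, wakeup_with_removal, and the attempt limit (added only so the example terminates) are hypothetical.

```ruby
# Toy model only. These helpers are hypothetical and are not CRuby internals.

# Walks the list in place. If a waiter raises, the retry starts again from the
# same failing waiter, so without the limit this would loop forever.
def wakeup_without_removal(waiters, attempt_limit)
  attempts = 0
  begin
    attempts += 1
    waiters.each { |waiter| waiter.call }
  rescue RuntimeError
    retry if attempts < attempt_limit
  end
  attempts
end

# Removes each waiter from the list before waking it, so a retry after an
# exception continues with the remaining waiters only.
def wakeup_with_removal(waiters, attempt_limit)
  attempts = 0
  begin
    attempts += 1
    until waiters.empty?
      waiter = waiters.shift # detach first, then wake
      waiter.call
    end
  rescue RuntimeError
    retry if attempts < attempt_limit
  end
  attempts
end

woken = []
failing = -> { raise "unblock failed" }
ok = -> { woken << :ok }

puts wakeup_without_removal([failing, ok], 5) # 5: every attempt hits the same failure
puts wakeup_with_removal([failing, ok], 5)    # 2: the failing waiter is gone on retry
p woken                                       # [:ok]
```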
These are already fixed on the master branch. Here is a PR for the backport: https://github.com/ruby/ruby/pull/4686
Updated by nagachika (Tomoyuki Chikanaga) about 1 month ago
I created a backport patch including 050a89543952a2c9e7c9bc938f4fdb538f6c9278 and 13f8521c630a15c87398dee0763e95f59c032a94 and pushed it to my branch. See https://github.com/ruby/ruby/pull/4686/files.
But on that branch, make btest hangs in bootstraptest/test_ractor.rb.
% make btest
2021-08-14 16:57:56 +0900
Driver is ruby 3.0.3p123 (2021-08-08 revision 3922394c85) [x86_64-darwin19]
Target is ruby 3.0.3p124 (2021-08-14 revision 720d9c0803) [x86_64-darwin19]
test_attr.rb PASS 2
test_autoload.rb PASS 8
test_block.rb PASS 58
test_class.rb PASS 48
test_env.rb PASS 2
test_eval.rb PASS 37
test_exception.rb PASS 34
test_fiber.rb PASS 5
test_finalizer.rb PASS 1
test_flip.rb PASS 1
test_flow.rb PASS 62
test_fork.rb PASS 4
test_gc.rb PASS 2
test_insns.rb PASS 383
test_io.rb PASS 9
test_jump.rb PASS 29
test_literal.rb PASS 156
test_literal_suffix.rb PASS 48
test_load.rb PASS 2
test_marshal.rb PASS 1
test_massign.rb PASS 34
test_method.rb PASS 223
test_objectspace.rb PASS 6
test_proc.rb PASS 37
test_ractor.rb \
↑ hangs up here
Samuel, would you review my backport candidate branch if you don't mind?