Bug #21090
Open
SEGV from require in Thread in Ractor
Description
When Ruby calls 'require' from a Thread in a non-main Ractor, it sometimes causes a SEGV.
$ ruby -e '1000.times { Ractor.new { th = Thread.new { require "rbconfig" }; Thread.pass }.take }' > segv.log 2>&1
Segmentation fault (core dumped)
segv.log is too large to paste in this description, so I attached it as a file.
Updated by luke-gru (Luke Gruber) 9 days ago · Edited
I couldn't track down the exact cause of the issue, but I do have a PR coming that solves it.
Edit: PR here https://github.com/ruby/ruby/pull/12646
If you call th.join in your script, it will still fail. That case needs another patch that is not yet merged: https://github.com/ruby/ruby/pull/12520
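For anyone who wants to try both variants without crashing their own shell session, here is a rough sketch that runs the script in a child interpreter. The helper names repro_script and run_in_child are made up for illustration, and putting th.join in place of Thread.pass is my assumption about where the join would go; the original report only shows the Thread.pass version.

```ruby
require "open3"
require "rbconfig"

# Build the script from the report. With join: true, th.join replaces
# Thread.pass (an assumption; the report does not show the exact placement).
def repro_script(join: false)
  <<~RUBY
    1000.times do
      Ractor.new do
        th = Thread.new { require "rbconfig" }
        #{join ? "th.join" : "Thread.pass"}
      end.take
    end
  RUBY
end

# Run a script in a child interpreter so a crash cannot take down the caller.
# Returns [killed_by_signal, combined stdout and stderr].
def run_in_child(script)
  output, status = Open3.capture2e(RbConfig.ruby, "-e", script)
  [status.signaled?, output]
end

# Example use (not executed here):
#   crashed, log = run_in_child(repro_script)
#   File.write("segv.log", log) if crashed
```

Since the crash is intermittent, a single clean run does not show that a build is fixed; running it several times is more informative.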
Updated by luke-gru (Luke Gruber) 3 days ago · Edited
I managed to track down the issues; there was more than one.
I'll copy my PR message below:
Stack struct memory was receiving weird values with lots of ractors when
calling Ractor#require in a thread that wasn't joined.
This was due to the script exiting before the ractor channel yields to the taker.
In that case, the taker can receive a FATAL interrupt and jump to its end, but
the channel might still try to use that stack object from the taker when its thread
starts. For this reason, we allocate it on the heap and free it after the require ends.
There were 2 more issues as well:
We must close the incoming port of the taker before raising, or another
ractor might try to yield to us. This would use the basket object on the
stack of the taker, but it will be corrupted after a raise.
There was an issue with ractor barriers during GC: any thread calling
ractor_sched_barrier_join_wait_locked must not lock TH_SCHED(th) of the
calling thread. Otherwise this could result in a deadlock if another
thread tries to lock our sched.
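To make the barrier deadlock easier to picture, here is a loose Ruby-level analogy. It is not VM code: sched_lock only stands in for TH_SCHED(th), and the Barrier class and barrier_demo method are hypothetical names invented for this sketch. It shows that waiting at a barrier while holding your own lock hangs as soon as another thread needs that lock before it can reach the barrier.

```ruby
# Illustrative analogy only: sched_lock stands in for TH_SCHED(th) and
# Barrier for the GC barrier. None of this is Ruby VM code.
class Barrier
  def initialize(parties)
    @parties = parties
    @arrived = 0
    @lock = Mutex.new
    @cond = ConditionVariable.new
  end

  # Block until `parties` threads have called wait.
  def wait
    @lock.synchronize do
      @arrived += 1
      @cond.broadcast if @arrived == @parties
      @cond.wait(@lock) until @arrived == @parties
    end
  end
end

# Returns :completed if both threads pass the barrier, :deadlocked otherwise.
def barrier_demo(hold_own_lock:)
  sched_lock = Mutex.new
  barrier = Barrier.new(2)
  ready = Queue.new

  waiter = Thread.new do
    if hold_own_lock
      # Pattern to avoid: wait at the barrier while holding our own lock.
      sched_lock.synchronize { ready << true; barrier.wait }
    else
      # Safe order: finish the locked work, release, then wait.
      sched_lock.synchronize { ready << true }
      barrier.wait
    end
  end

  other = Thread.new do
    ready.pop                   # run only after the waiter has taken its lock
    sched_lock.synchronize { }  # needs the waiter's lock before the barrier
    barrier.wait
  end

  # Timed joins, so a stuck pair is reported instead of hanging the caller.
  finished = [waiter, other].map { |t| t.join(0.5) }.all?
  [waiter, other].each(&:kill) unless finished
  finished ? :completed : :deadlocked
end
```

Calling barrier_demo(hold_own_lock: true) shows the hang, and barrier_demo(hold_own_lock: false) shows the order that avoids it.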