Bug #19234
closed
[3.2.0dev] YJIT code GC can lead to crashes
Description
Filing this bug here in case some people may have observed it too and may have more information, and also to keep track of it for the upcoming 3.2.0 release.
After changing some settings on our CI to make sure YJIT's code_gc would trigger, we discovered that it sometimes causes crashes.
The crash can take many different forms (e.g. [BUG] Segmentation fault at 0x00005604a8e78006 or [BUG] Illegal instruction at 0x0000aaaacc0ce4c0), and it happens on both x86 and arm64.
It happens very consistently on our CI, but only after running for 15 to 20 minutes, and we haven't been able to reduce it to a local reproduction script.
When it does happen, however, the backtrace isn't really helpful:
-- C level backtrace information -------------------------------------------
/usr/local/ruby/bin/real-ruby(rb_print_backtrace+0x11) [0x5604a8a6df7d] vm_dump.c:770
/usr/local/ruby/bin/real-ruby(rb_vm_bugreport) vm_dump.c:1065
/usr/local/ruby/bin/real-ruby(rb_bug_for_fatal_signal+0xee) [0x5604a8ba927e] error.c:813
/usr/local/ruby/bin/real-ruby(sigsegv+0x4d) [0x5604a89c3ded] signal.c:964
/lib/x86_64-linux-gnu/libpthread.so.0(__restore_rt+0x0) [0x7fb5b4285420]
[0x5604acd31079]
Like regular GC bugs, it is likely that the code GC needs to trigger at a very specific place for the bug to happen. Our attempts at triggering it manually with RubyVM::YJIT.code_gc, or at setting the executable memory very low so that it triggers more often, didn't yield a simpler reproduction.
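For anyone who wants to experiment along the same lines, here is a minimal sketch of that kind of stress attempt. It is not a reproduction of the crash (we don't have one), and the helper names `yjit_available?` and `stress_code_gc` are hypothetical, introduced only for illustration. It defines many small methods, calls each repeatedly so that YJIT is likely to compile them, and then requests a code GC via RubyVM::YJIT.code_gc when that method is available.

```ruby
# Hypothetical helper: true only when this Ruby has YJIT and it is enabled.
def yjit_available?
  defined?(RubyVM::YJIT) &&
    RubyVM::YJIT.respond_to?(:enabled?) &&
    RubyVM::YJIT.enabled?
end

# Hypothetical stress loop: generate lots of small methods, call them enough
# times that YJIT is likely to compile them (the exact call threshold depends
# on the Ruby version and flags), then request a code GC after each round.
def stress_code_gc(rounds: 3, methods_per_round: 200)
  holder = Class.new
  total = 0

  rounds.times do |round|
    methods_per_round.times do |i|
      holder.define_method(:"m_#{round}_#{i}") { |x| x + i }
    end

    obj = holder.new
    methods_per_round.times do |i|
      40.times { total += obj.public_send(:"m_#{round}_#{i}", 1) }
    end

    # Only call code_gc when YJIT is enabled and the method exists.
    if yjit_available? && RubyVM::YJIT.respond_to?(:code_gc)
      RubyVM::YJIT.code_gc
    end
  end

  total
end

if __FILE__ == $PROGRAM_NAME
  puts "YJIT enabled: #{yjit_available? ? true : false}"
  puts "checksum: #{stress_code_gc}"
end
```

Running this with something like `ruby --yjit --yjit-exec-mem-size=1 script.rb` should shrink the executable memory so that code GC also triggers on its own more often. In our experience neither approach was enough to hit the crash outside of CI.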
Both @k0kubun (Takashi Kokubun) and @alanwu (Alan Wu) are investigating it right now.