Bug #18464
RUBY_INTERNAL_EVENT_NEWOBJ tracepoint causes an interpreter crash when combined with Ractors
Description
When a Ractor is created whilst a tracepoint for RUBY_INTERNAL_EVENT_NEWOBJ is active (registered with rb_tracepoint_new/rb_tracepoint_enabled), the interpreter crashes with a null pointer dereference and the following backtrace:
[BUG] Segmentation fault at 0x0000000000000000
ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-darwin20]
...
-- C level backtrace information -------------------------------------------
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(rb_print_backtrace+0xf) [0x10a15fadd] vm_dump.c:759
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(rb_vm_bugreport) vm_dump.c:1045
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(rb_vm_bugreport) (null):0
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(bug_report_end+0x0) [0x109f96b81] error.c:820
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(rb_bug_for_fatal_signal) error.c:820
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(sigsegv+0x52) [0x10a0be3a2] signal.c:964
/usr/lib/system/libsystem_platform.dylib(_sigtramp+0x1d) [0x7fff20934d7d]
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(gc_event_hook_body+0x4) [0x109fb9d21] gc.c:2214
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(newobj_slowpath) gc.c:2486
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(newobj_slowpath_wb_unprotected) gc.c:2507
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(newobj_fill+0x0) [0x109fac92e] gc.c:2543
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(newobj_of0) gc.c:2553
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(newobj_of) gc.c:2552
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(rb_wb_unprotected_newobj_of) gc.c:2567
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(io_alloc+0x12) [0x109fd341c] io.c:1047
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(prep_io) io.c:8483
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(prep_stdio) io.c:8514
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(rb_io_prep_stdin) io.c:8532
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(thread_start_func_2+0xf7) [0x10a1058a7] thread.c:802
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(rb_native_cond_initialize+0x0) [0x10a1055fb] ./thread_pthread.c:1047
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(register_cached_thread_and_wait) ./thread_pthread.c:1099
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(thread_start_func_1) ./thread_pthread.c:1054
/usr/lib/system/libsystem_pthread.dylib(_pthread_start+0xe0) [0x7fff208ef8fc]
(full output is attached).
This seems to be because the new Ractor sets up its stdio objects (rb_io_prep_stdin et al.), which in turn allocates Ruby objects, before rb_ec_initialize_vm_stack is called to set up the initial stack frame. The NEWOBJ hook therefore fires while the execution context has no control frame yet.
I've attached a patch which works around this by not firing GC event hooks if there is no control frame on the execution context. The patch also includes a test which reproduces the issue using the objspace extension; creating a Ractor within an ObjectSpace.trace_object_allocations block is enough to trigger the crash. The patch seems to fix things, but if you folks prefer, I can also try swapping the order of prep_stdio and rb_ec_initialize_vm_stack.
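For reference, a minimal reproduction sketch along the lines described above (this is not the test from the patch, just an illustration). It runs the Ractor creation in a child interpreter so that a crash does not take down the calling process. The constant REPRO_SCRIPT and the helper ractor_newobj_repro_ok? are names made up for this example, and the fallback between Ractor#value and Ractor#take is an assumption to cover Ruby versions where one or the other is missing.

```ruby
require "rbconfig"
require "open3"

# Script executed in a child interpreter (hypothetical name, for illustration).
# Creating a Ractor inside trace_object_allocations is the trigger described
# in this report.
REPRO_SCRIPT = <<~'RUBY'
  require "objspace"
  ObjectSpace.trace_object_allocations do
    r = Ractor.new { :ok }
    # Assumption: newer Rubies provide Ractor#value, older ones Ractor#take.
    r.respond_to?(:value) ? r.value : r.take
  end
RUBY

# Returns true if the child interpreter exited cleanly, false otherwise
# (a crash, or any other failure such as Ractor being unavailable).
def ractor_newobj_repro_ok?
  _out, _err, status = Open3.capture3(RbConfig.ruby, "-e", REPRO_SCRIPT)
  # success? is nil when the child was killed by a signal, so compare to true.
  status.success? == true
end

puts(ractor_newobj_repro_ok? ? "no crash" : "child interpreter failed or crashed")
```

On an affected build I would expect the child to die with the segfault shown above; on a build with the fix it should exit cleanly.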
Updated by nobu (Nobuyoshi Nakada) 5 months ago
- Assignee set to ko1 (Koichi Sasada)
- Status changed from Open to Assigned
Updated by kjtsanaktsidis (KJ Tsanaktsidis) 14 days ago
Just checked, this is still an issue with 3.2.0-preview1. Is there any feedback on the patch I posted? Any other way you would suggest going about a solution? Thanks!