Project

General

Profile

Actions

Bug #18464

open

RUBY_INTERNAL_EVENT_NEWOBJ tracepoint causes an interpreter crash when combined with Ractors

Added by kjtsanaktsidis (KJ Tsanaktsidis) 11 months ago. Updated 18 days ago.

Status:
Assigned
Priority:
Normal
Target version:
-
ruby -v:
ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-darwin20]
[ruby-core:107005]

Description

When a Ractor is created whilst a tracepoint for RUBY_INTERNAL_EVENT_NEWOBJ is active (registered with rb_tracepoint_new/rb_tracepoint_enabled), the interpreter crashes with a null pointer dereference with the following backtrace:

[BUG] Segmentation fault at 0x0000000000000000
ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-darwin20]

...

-- C level backtrace information -------------------------------------------
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(rb_print_backtrace+0xf) [0x10a15fadd] vm_dump.c:759
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(rb_vm_bugreport) vm_dump.c:1045
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(rb_vm_bugreport) (null):0
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(bug_report_end+0x0) [0x109f96b81] error.c:820
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(rb_bug_for_fatal_signal) error.c:820
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(sigsegv+0x52) [0x10a0be3a2] signal.c:964
/usr/lib/system/libsystem_platform.dylib(_sigtramp+0x1d) [0x7fff20934d7d]
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(gc_event_hook_body+0x4) [0x109fb9d21] gc.c:2214
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(newobj_slowpath) gc.c:2486
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(newobj_slowpath_wb_unprotected) gc.c:2507
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(newobj_fill+0x0) [0x109fac92e] gc.c:2543
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(newobj_of0) gc.c:2553
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(newobj_of) gc.c:2552
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(rb_wb_unprotected_newobj_of) gc.c:2567
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(io_alloc+0x12) [0x109fd341c] io.c:1047
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(prep_io) io.c:8483
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(prep_stdio) io.c:8514
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(rb_io_prep_stdin) io.c:8532
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(thread_start_func_2+0xf7) [0x10a1058a7] thread.c:802
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(rb_native_cond_initialize+0x0) [0x10a1055fb] ./thread_pthread.c:1047
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(register_cached_thread_and_wait) ./thread_pthread.c:1099
/Users/ktsanaktsidis/Code/zendesk/ruby/ruby(thread_start_func_1) ./thread_pthread.c:1054
/usr/lib/system/libsystem_pthread.dylib(_pthread_start+0xe0) [0x7fff208ef8fc]

(full output is attached).

This seems to be because the new Ractor sets up stdio objects (rb_io_prep_stdin et. al.), which in turn allocate Ruby objects, before rb_ec_initialize_vm_stack is called to set up the initial stack frame.

I've attached a patch which works around this by not firing GC event hooks if there is no control frame on the execution context. The patch also includes a test which reproduces the issue using the objspace extension; creating a Ractor within an ObjectSpace.trace_object_allocations block is enough to trigger the crash. The patch seems to fix things, but if you folk prefer I can also try swapping around the order of prep_stdio and rb_ec_initialize_vm_stack.


Files

0001-Fix-interpreter-crash-caused-by-RUBY_INTERNAL_EVENT_.patch (1.91 KB) 0001-Fix-interpreter-crash-caused-by-RUBY_INTERNAL_EVENT_.patch kjtsanaktsidis (KJ Tsanaktsidis), 01/08/2022 04:34 AM
crash.log (26.1 KB) crash.log kjtsanaktsidis (KJ Tsanaktsidis), 01/08/2022 04:35 AM
ruby_2022-01-08-151326_8927-ktsanaktsidis.crash (18.8 KB) ruby_2022-01-08-151326_8927-ktsanaktsidis.crash kjtsanaktsidis (KJ Tsanaktsidis), 01/08/2022 04:37 AM

Updated by nobu (Nobuyoshi Nakada) 11 months ago

  • Status changed from Open to Assigned
  • Assignee set to ko1 (Koichi Sasada)

Updated by kjtsanaktsidis (KJ Tsanaktsidis) 7 months ago

Just checked, this is still an issue with 3.2.0-preview1. Is there any feedback on the patch I posted? Any other way you would suggest going about a solution? Thanks!

Updated by kjtsanaktsidis (KJ Tsanaktsidis) 6 months ago

I opened a PR with this patch. Happy to try fixing it a different way but this at least stops the crash. https://github.com/ruby/ruby/pull/5990

Updated by ivoanjo (Ivo Anjo) 20 days ago

If it helps, here's a Linux-based backtrace:

-- C level backtrace information -------------------------------------------
/usr/local/lib/libruby.so.3.1(rb_print_backtrace+0x11) [0x7f75e6678aa8] vm_dump.c:759
/usr/local/lib/libruby.so.3.1(rb_vm_bugreport) vm_dump.c:1045
/usr/local/lib/libruby.so.3.1(rb_bug_for_fatal_signal+0xf0) [0x7f75e6477750] error.c:821
/usr/local/lib/libruby.so.3.1(sigsegv+0x49) [0x7f75e65ced19] signal.c:964
/lib/x86_64-linux-gnu/libpthread.so.0(__restore_rt+0x0) [0x7f75e636e140]
/usr/local/lib/libruby.so.3.1(gc_event_hook_body+0x1b) [0x7f75e649010b] gc.c:2217
/usr/local/lib/libruby.so.3.1(gc_enter+0x1f) [0x7f75e64a495f] gc.c:9194
/usr/local/lib/libruby.so.3.1(gc_enter) gc.c:9165
/usr/local/lib/libruby.so.3.1(gc_sweep_continue) gc.c:5743
/usr/local/lib/libruby.so.3.1(heap_prepare) gc.c:2193
/usr/local/lib/libruby.so.3.1(heap_next_freepage) gc.c:2388
/usr/local/lib/libruby.so.3.1(ractor_cache_slots) gc.c:2424
/usr/local/lib/libruby.so.3.1(newobj_slowpath) gc.c:2484
/usr/local/lib/libruby.so.3.1(newobj_slowpath_wb_unprotected) gc.c:2510
/usr/local/lib/libruby.so.3.1(newobj_fill+0x0) [0x7f75e64a4cf9] gc.c:2546
/usr/local/lib/libruby.so.3.1(newobj_of) gc.c:2556
/usr/local/lib/libruby.so.3.1(rb_wb_unprotected_newobj_of) gc.c:2570
/usr/local/lib/libruby.so.3.1(io_alloc+0x5) [0x7f75e64ccbca] io.c:1047
/usr/local/lib/libruby.so.3.1(prep_io) io.c:8479
/usr/local/lib/libruby.so.3.1(prep_stdio) io.c:8510
/usr/local/lib/libruby.so.3.1(rb_io_prep_stdin) io.c:8528
/usr/local/lib/libruby.so.3.1(thread_start_func_2+0x165) [0x7f75e6619965] thread.c:802
/usr/local/lib/libruby.so.3.1(register_cached_thread_and_wait+0x0) [0x7f75e661a6f9] thread_pthread.c:1047
/usr/local/lib/libruby.so.3.1(thread_start_func_1) thread_pthread.c:1054
/lib/x86_64-linux-gnu/libpthread.so.0(0x8ea7) [0x7f75e6362ea7]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f75e607fdef]

Updated by ivoanjo (Ivo Anjo) 18 days ago

Interestingly, my crash happened on RUBY_INTERNAL_EVENT_GC_ENTER (you can see my stack includes an attempt to garbage collect) but I believe the fix would work for this situation as well.

Actions

Also available in: Atom PDF

Like1
Like0Like0Like0Like0Like0