Bug #18967
Closed
Segmentation fault in stackprof with Ruby 2.7.6
Added by RubyBugs (A Nonymous) over 1 year ago. Updated over 1 year ago.
Description
Ruby 2.7.6 appears to have broken the stackprof gem, crashing on every run with a segmentation fault.
Please see the following issues reported on stackprof:
https://github.com/tmm1/stackprof/issues/185
https://github.com/tmm1/stackprof/issues/182
Files
stackprof_crash_ruby_2_7_6.txt.bz2 (69.1 KB) | RubyBugs (A Nonymous), 08/21/2022 06:53 PM
Updated by mame (Yusuke Endoh) over 1 year ago
- Status changed from Open to Feedback
Thank you for your report. We need at least the full trace, especially the "C level backtrace information" section. Can you provide it?
Updated by RubyBugs (A Nonymous) over 1 year ago
mame (Yusuke Endoh) wrote in #note-1:
Thank you for your report. We need at least the full trace, especially the "C level backtrace information" section. Can you provide it?
Hello! Thank you. I have captured the full trace and attached it to this reply, compressed with bzip2. Please let me know whether you are able to view it.
Updated by RubyBugs (A Nonymous) over 1 year ago
Per @Eregon (Benoit Daloze) on https://github.com/tmm1/stackprof/issues/182#issuecomment-1221274946 -
It appears that the stackprof gem may have been segfaulting in CI for some time: https://github.com/tmm1/stackprof/actions/workflows/ci.yml
Updated by mame (Yusuke Endoh) over 1 year ago
- Status changed from Feedback to Open
Thank you for providing the full stack trace. Unfortunately, I couldn't find the cause, sorry.
Ruby 2.7 is in the security maintenance phase (maybe EOL next March), so I'd recommend moving to Ruby 3.0 or later as soon as possible.
Possibly relevant stack trace fragment:
/ruby/bin/../lib/libruby.so.2.7(sigsegv+0x4b) [0x7faf1e64c0cb] signal.c:946
/lib/x86_64-linux-gnu/libc.so.6(0x7faf1e0f0f10) [0x7faf1e0f0f10]
/ruby/bin/../lib/libruby.so.2.7(imemo_type+0x0) [0x7faf1e6a9b39] vm_insnhelper.c:588
/ruby/bin/../lib/libruby.so.2.7(check_method_entry) vm_insnhelper.c:594
/ruby/bin/../lib/libruby.so.2.7(rb_vm_frame_method_entry) vm_insnhelper.c:618
/ruby/bin/../lib/libruby.so.2.7(rb_profile_frames+0x78) [0x7faf1e6c8308] vm_backtrace.c:1323
/usr/packages/ruby-2.7.6/gems/stackprof-0.2.20/lib/stackprof/stackprof.so(stackprof_buffer_sample+0x68) [0x7faf0edce678] stackprof.c:615
/usr/packages/ruby-2.7.6/gems/stackprof-0.2.20/lib/stackprof/stackprof.so(stackprof_buffer_sample) (null):0
/usr/packages/ruby-2.7.6/gems/stackprof-0.2.20/lib/stackprof/stackprof.so(stackprof_signal_handler+0xcd) [0x7faf0edce8ed] stackprof.c:740
/lib/x86_64-linux-gnu/libc.so.6(0x7faf1e0f0f10) [0x7faf1e0f0f10]
/lib/x86_64-linux-gnu/libpthread.so.0(pthread_cond_timedwait+0x289) [0x7faf1dc83fb9]
Indeed, the segfault occurs in stackprof's hook. I looked at the code around this, but couldn't find any significant difference between 2.7 and 3.0.
I think there may be a garbage VALUE in the VM stack, but I don't recall such a problem. Does anyone have any ideas?
Updated by Eregon (Benoit Daloze) over 1 year ago
- Status changed from Open to Third Party's Issue
I found the bug: https://github.com/tmm1/stackprof/pull/180/files#r951294711
Updated by Eregon (Benoit Daloze) over 1 year ago
i.e., rb_profile_frames is called from the signal handler at an arbitrary point of execution, and that's not supported on Ruby < 3.0.
(TBH even on >= 3.0 I wonder if it's truly supported, it seems pretty dangerous to call rb_profile_frames()/anything not-async-signal-safe from a signal handler)
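For readers unfamiliar with the concern, here is a minimal sketch of the usual conservative pattern: the signal handler only sets a flag, and the actual work runs later from a known-safe point. This is a generic illustration, not stackprof's or Ruby's code; `sample_pending`, `on_profile_signal`, `drain_pending_sample` and `install_handler` are hypothetical names, and the counter increment merely stands in for a real stack walk.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical names throughout; nothing here is taken from stackprof or Ruby. */
static volatile sig_atomic_t sample_pending = 0;
static size_t samples_taken = 0;

/* Signal handler: only records that a sample was requested.
 * Writing to a volatile sig_atomic_t is allowed in a signal handler. */
static void on_profile_signal(int signo)
{
    (void)signo;
    sample_pending = 1;
}

/* Called later, outside signal context, at a point where the program's
 * data structures are known to be consistent.
 * Returns 1 if a pending sample was processed, 0 otherwise. */
static int drain_pending_sample(void)
{
    if (!sample_pending) {
        return 0;
    }
    sample_pending = 0;
    samples_taken++; /* stand-in for the real (non-async-signal-safe) stack walk */
    return 1;
}

/* Installs the handler for the given signal; returns 0 on success. */
static int install_handler(int signo)
{
    struct sigaction sa;

    assert(signo > 0);
    sa.sa_handler = on_profile_signal;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

The trade-off is that the sample is taken at the next safe point rather than exactly where the signal arrived, which costs some precision in exchange for never walking a half-updated stack.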
Updated by ivoanjo (Ivo Anjo) over 1 year ago
As pointed out by @Eregon (Benoit Daloze), I planned on experimenting with this and raising it at some point. The change in https://github.com/ruby/ruby/commit/0e276dc458f94d9d79a0f7c7669bde84abe80f21 did reorder things as far as the C source goes, but as far as I can see there doesn't seem to be anything guaranteeing that the compiler won't reorder the write to ec->cfp with the actual initialization of the structure.
So... yeah this doesn't seem particularly safe at this moment.
(But it would be great if rb_profile_frames could indeed be made async-safe!)
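To illustrate the reordering concern in isolation, here is a small sketch using C11 atomics: the frame is fully initialized first and only then published through a release store, so a reader that loads the pointer with acquire ordering cannot observe a half-initialized frame. This is an assumption-laden illustration, not Ruby's implementation; `frame_t`, `exec_ctx_t`, `push_frame` and `current_frame` are hypothetical stand-ins for the real VM structures.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT Ruby's real control frame or
 * execution context types. */
typedef struct {
    const void *pc;
    const void *self;
} frame_t;

typedef struct {
    _Atomic(frame_t *) cfp; /* pointer to the current top frame */
} exec_ctx_t;

/* Push: initialize every field of the frame, then publish the pointer.
 * The release store keeps the field writes from being reordered after it. */
static void push_frame(exec_ctx_t *ec, frame_t *slot,
                       const void *pc, const void *self)
{
    assert(slot != NULL);
    slot->pc = pc;
    slot->self = self;
    atomic_store_explicit(&ec->cfp, slot, memory_order_release);
}

/* Reader side (for example, a sampler): the acquire load pairs with the
 * release store above, so the fields of the returned frame are initialized. */
static const frame_t *current_frame(exec_ctx_t *ec)
{
    return atomic_load_explicit(&ec->cfp, memory_order_acquire);
}
```

With a plain (non-atomic, non-volatile) pointer write and no barrier, the compiler is free to emit the pointer store before the field stores, which is the situation described above.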
Updated by byroot (Jean Boussier) over 1 year ago
The fix is here: https://github.com/tmm1/stackprof/pull/186 I'll try to get a release out soon.
@RubyBugs (A Nonymous) in the meantime I suggest sticking to 0.2.19.
Updated by tenderlovemaking (Aaron Patterson) over 1 year ago
byroot (Jean Boussier) wrote in #note-8:
The fix is here: https://github.com/tmm1/stackprof/pull/186 I'll try to get a release out soon.
@RubyBugs (A Nonymous) in the meantime I suggest sticking to 0.2.19.
I merged it and shipped 0.2.21. Thanks!