Bug #17052
closedRuby with LTO enabled has issues with SIGSEGV handler
Added by vo.x (Vit Ondruch) over 4 years ago. Updated about 3 years ago.
Description
Fedora aims to enable link time optimization (LTO) of packages in next release. The specific changes in configuration options are available here 1. Since that time, I observe following errors 2 at least on {aarch64,ppc64le} (and possibly also other architectures):
1) Failure:
TestBugReporter#test_bug_reporter_add [/builddir/build/BUILD/ruby-2.7.1/test/-ext-/bug_reporter/test_bug_reporter.rb:22]:
pid 32395 killed by SIGSEGV (signal 11) (core dumped)
| -:1: [BUG] Segmentation fault at 0x000003e800007e8b
| ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [powerpc64le-linux]
|
| -- Control frame information -----------------------------------------------
| c:0003 p:---- s:0012 e:000011 CFUNC :kill
| c:0002 p:0021 s:0006 e:000005 EVAL -:1 [FINISH]
| c:0001 p:0000 s:0003 E:000f80 (none) [FINISH]
|
| -- Ruby level backtrace information ----------------------------------------
| -:1:in `<main>'
| -:1:in `kill'
|
| -- C level backtrace information -------------------------------------------
.
1. [2/2] Assertion for "stderr"
| Expected /Sample bug reporter: 12345/
| to match
| "-- Control frame information -----------------------------------------------\n"+
| "c:0003 p:---- s:0012 e:000011 CFUNC :kill\n"+
| "c:0002 p:0021 s:0006 e:000005 EVAL -:1 [FINISH]\n"+
| "c:0001 p:0000 s:0003 E:000f80 (none) [FINISH]\n\n"+
| "-- Ruby level backtrace information ----------------------------------------\n"+
| "-:1:in `<main>'\n"+
| "-:1:in `kill'\n\n"+
| "-- C level backtrace information -------------------------------------------\n"
| after 4 patterns with 120 characters.
2) Failure:
TestRubyOptions#test_segv_loaded_features [/builddir/build/BUILD/ruby-2.7.1/test/ruby/test_rubyoptions.rb:735]:
pid 38444 killed by SIGSEGV (signal 11) (core dumped)
| -e:1: [BUG] Segmentation fault at 0x000003e80000962c
| ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [powerpc64le-linux]
|
| -- Control frame information -----------------------------------------------
| c:0003 p:---- s:0012 e:000011 CFUNC :kill
| c:0002 p:0016 s:0006 e:000005 BLOCK -e:1 [FINISH]
| c:0001 p:0000 s:0003 E:002460 (none) [FINISH]
|
| -- Ruby level backtrace information ----------------------------------------
| -e:1:in `block in <main>'
| -e:1:in `kill'
|
| -- C level backtrace information -------------------------------------------
.
1. [2/2] Assertion for "stderr"
| <""> expected but was
| <"-- C level backtrace information -------------------------------------------\n">.
3) Failure:
TestRubyOptions#test_segv_setproctitle [/builddir/build/BUILD/ruby-2.7.1/test/ruby/test_rubyoptions.rb:749]:
pid 38451 killed by SIGSEGV (signal 11) (core dumped)
| -e:1: [BUG] Segmentation fault at 0x000003e800009633
| ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [powerpc64le-linux]
|
| -- Control frame information -----------------------------------------------
| c:0003 p:---- s:0012 e:000011 CFUNC :kill
| c:0002 p:0029 s:0006 e:000005 EVAL -e:1 [FINISH]
| c:0001 p:0000 s:0003 E:000480 (none) [FINISH]
|
| -- Ruby level backtrace information ----------------------------------------
| -e:1:in `<main>'
| -e:1:in `kill'
|
| -- C level backtrace information -------------------------------------------
.
1. [2/2] Assertion for "stderr"
| <""> expected but was
| <"-- C level backtrace information -------------------------------------------\n">.
4) Failure:
TestRubyOptions#test_segv_test [/builddir/build/BUILD/ruby-2.7.1/test/ruby/test_rubyoptions.rb:729]:
pid 38460 killed by SIGSEGV (signal 11) (core dumped)
| -e:1: [BUG] Segmentation fault at 0x000003e80000963c
| ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [powerpc64le-linux]
|
| -- Control frame information -----------------------------------------------
| c:0003 p:---- s:0012 e:000011 CFUNC :kill
| c:0002 p:0015 s:0006 e:000005 EVAL -e:1 [FINISH]
| c:0001 p:0000 s:0003 E:0006a0 (none) [FINISH]
|
| -- Ruby level backtrace information ----------------------------------------
| -e:1:in `<main>'
| -e:1:in `kill'
|
| -- C level backtrace information -------------------------------------------
.
1. [2/2] Assertion for "stderr"
| <""> expected but was
| <"-- C level backtrace information -------------------------------------------\n">.
Finished tests in 840.600443s, 25.0047 tests/s, 3238.9681 assertions/s.
21019 tests, 2722678 assertions, 4 failures, 0 errors, 70 skips
ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [powerpc64le-linux]
When I raised the issue on fedora-devel ML 3, there was suggestion that it might happen when signal handler modifies any global variable. Now I am not sure if that is the case. Can somebody confirm? Or investigate/fix this, please?
Files
ruby-addr2line-DW_FORM_ref_addr.patch (651 Bytes) ruby-addr2line-DW_FORM_ref_addr.patch | Correctly interpret the size of `DW_FORM_ref_addr` | xtkoba (Tee KOBAYASHI), 08/12/2021 09:56 PM | |
ruby-addr2line-read_abstract_origin.patch (1.37 KB) ruby-addr2line-read_abstract_origin.patch | Restrict forms in `read_abstract_origin` | xtkoba (Tee KOBAYASHI), 08/13/2021 01:40 PM | |
ruby-addr2line-abbrev_table.patch (1.49 KB) ruby-addr2line-abbrev_table.patch | addr2line.c: Correct `abbrev_table` entries | xtkoba (Tee KOBAYASHI), 08/17/2021 07:42 AM | |
ruby-addr2line-abbrev_table-r1.patch (2.81 KB) ruby-addr2line-abbrev_table-r1.patch | Correct `abbrev_table` entries (revised) | xtkoba (Tee KOBAYASHI), 08/17/2021 10:01 AM | |
ruby-addr2line-debug_info_read-debug-code-0001.patch (1.11 KB) ruby-addr2line-debug_info_read-debug-code-0001.patch | Debug code for `debug_info_read` | xtkoba (Tee KOBAYASHI), 08/19/2021 12:51 PM |
Updated by vo.x (Vit Ondruch) over 4 years ago
- Subject changed from Ruby with LTO enabled on {aarch64,ppc64le} architectures. to Ruby with LTO enabled has issues with SIGSEGV handler
Actually, the platform might not matter. It might fail just randomly, because now I have caught the same issue also on x86_64.
Updated by shyouhei (Shyouhei Urabe) over 4 years ago
- Status changed from Open to Third Party's Issue
Yes I can reproduce this on my machine.
% LC_ALL=C gdb --args ./miniruby -e'Process.kill("SIGSEGV",$$)'
GNU gdb (Ubuntu 8.2-0ubuntu1~18.04) 8.2
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./miniruby...done.
(gdb) run
Starting program: ./miniruby -eProcess.kill\(\"SIGSEGV\",\$\$\)
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7a23237 in kill () at ../sysdeps/unix/syscall-template.S:78
78 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0 0x00007ffff7a23237 in kill () at ../sysdeps/unix/syscall-template.S:78
#1 0x000055555571ec9b in rb_f_kill (argc=2, argv=0x7ffff7ecb048) at signal.c:480
#2 0x0000555555789c44 in vm_call_cfunc_with_frame (ec=0x555555add590, reg_cfp=0x7ffff7fcafa0, calling=<optimized out>, cd=<optimized out>) at vm_insnhelper.c:2601
#3 0x000055555578d4d5 in vm_sendish (ec=0x555555add590, reg_cfp=0x7ffff7fcafa0, cd=0x555555bfaf30, block_handler=<optimized out>, method_explorer=<optimized out>) at vm_callinfo.h:337
#4 0x000055555578ef98 in vm_exec_core (ec=0x555555add590, initial=11) at insns.def:799
#5 0x00005555557a4ee4 in rb_vm_exec (ec=0x555555add590, mjit_enable_p=1) at vm.c:1953
#6 0x00005555555ff1d0 in rb_ec_exec_node (ec=ec@entry=0x555555add590, n=n@entry=0x555555ae9958) at eval.c:296
#7 0x0000555555603533 in ruby_run_node (n=0x555555ae9958) at eval.c:354
#8 0x000055555557ee0f in main (argc=<optimized out>, argv=<optimized out>) at main.c:50
(gdb) c
Continuing.
-e:1: [BUG] Segmentation fault at 0x0000286b000022c1
ruby 2.8.0dev (2020-08-20T02:48:16Z flto 71753b7f6d) [x86_64-linux]
-- Control frame information -----------------------------------------------
c:0003 p:---- s:0012 e:000011 CFUNC :kill
c:0002 p:0015 s:0006 e:000005 EVAL -e:1 [FINISH]
c:0001 p:0000 s:0003 E:000a10 (none) [FINISH]
-- Ruby level backtrace information ----------------------------------------
-e:1:in `<main>'
-e:1:in `kill'
-- Machine register context ------------------------------------------------
RIP: 0x00007ffff7a23237 RBP: 0x0000000000000002 RSP: 0x00007fffffffc2b8
RAX: 0x0000000000000000 RBX: 0x0000000000000001 RCX: 0x00007ffff7a23237
RDX: 0x0000555555adca01 RDI: 0x00000000000022c1 RSI: 0x000000000000000b
R8: 0x0000555555b2a610 R9: 0x0000555555bed890 R10: 0x0000555555bed890
R11: 0x0000000000000206 R12: 0x00007ffff7ecb048 R13: 0x00000000000022c1
R14: 0x00000000000022c1 R15: 0x000000000000000b EFL: 0x0000000000000206
-- C level backtrace information -------------------------------------------
Program received signal SIGSEGV, Segmentation fault.
0x00005555557a9a90 in uleb128 (p=0x555555bd5428) at addr2line.c:1274
1274 uint64_t abbrev_number = uleb128(&reader->p);
(gdb) bt
#0 0x00005555557a9a90 in uleb128 (p=0x555555bd5428) at addr2line.c:1274
#1 di_read_die (reader=0x555555bd53f0, die=0x555555bd5310) at addr2line.c:1274
#2 0x00005555557bb65d in read_abstract_origin (line=0x555555bd5380, abstract_origin=<optimized out>, reader=0x555555bd53f0) at addr2line.c:1486
#3 debug_info_read (traces=<optimized out>, offset=<optimized out>, lines=<optimized out>, num_traces=<optimized out>, reader=<optimized out>) at addr2line.c:1550
#4 fill_lines (num_traces=num_traces@entry=16, check_debuglink=check_debuglink@entry=1, objp=objp@entry=0x555555bd5d18, lines=lines@entry=0x555555c15480, offset=0, offset@entry=-1, traces=<optimized out>) at addr2line.c:1763
#5 0x00005555557bc069 in rb_dump_backtrace_with_lines.constprop.0 (num_traces=16, traces=<optimized out>) at addr2line.c:2150
#6 0x00005555557a8810 in rb_print_backtrace () at vm_dump.c:759
#7 0x00005555557acf21 in rb_vm_bugreport (ctx=<optimized out>) at vm_dump.c:955
#8 0x00005555555f59a4 in rb_bug_for_fatal_signal (default_sighandler=0x0, sig=11, ctx=0x555555bd6180, fmt=0x5555557ff3fa "Segmentation fault at %p") at error.c:675
#9 0x000055555571ac29 in sigsegv (sig=11, info=0x555555bd62b0, ctx=0x555555bd6180) at signal.c:959
#10 <signal handler called>
#11 0x00007ffff7a23237 in kill () at ../sysdeps/unix/syscall-template.S:78
#12 0x000055555571ec9b in rb_f_kill (argc=2, argv=0x7ffff7ecb048) at signal.c:480
#13 0x0000555555789c44 in vm_call_cfunc_with_frame (ec=0x555555add590, reg_cfp=0x7ffff7fcafa0, calling=<optimized out>, cd=<optimized out>) at vm_insnhelper.c:2601
#14 0x000055555578d4d5 in vm_sendish (ec=0x555555add590, reg_cfp=0x7ffff7fcafa0, cd=0x555555bfaf30, block_handler=<optimized out>, method_explorer=<optimized out>) at vm_callinfo.h:337
#15 0x000055555578ef98 in vm_exec_core (ec=0x555555add590, initial=11) at insns.def:799
#16 0x00005555557a4ee4 in rb_vm_exec (ec=0x555555add590, mjit_enable_p=1) at vm.c:1953
#17 0x00005555555ff1d0 in rb_ec_exec_node (ec=ec@entry=0x555555add590, n=n@entry=0x555555ae9958) at eval.c:296
#18 0x0000555555603533 in ruby_run_node (n=0x555555ae9958) at eval.c:354
#19 0x000055555557ee0f in main (argc=<optimized out>, argv=<optimized out>) at main.c:50
(gdb)
It seems the generated DWARF section is broken. For instance addr2line(1)
also fails to understand it.
% nm ./miniruby | fgrep -w rb_f_kill | LC_ALL=C addr2line -e ./miniruby
addr2line: Dwarf Error: Could not find abbrev number 64.
??:?
:?
When you kill LTO option the above one liner must show "signal.c:423" or something.
@vo.x (Vit Ondruch) is it possible for you to ask this to linker people instead? As addr2line(1)
is also affected, it is hard for me to think we are the ones who is doing something wrong.
Updated by shyouhei (Shyouhei Urabe) over 4 years ago
Meanwhile it can not be a bad idea for us to avoid SEGV even when DWARF is broken.
Updated by vo.x (Vit Ondruch) over 4 years ago
Thank you for the investigation. There is a whole thread on Fedora devel mailing list:
I have posted results of your investigations there and I think Jeff Law will be back soon with some advice/fix.
Updated by vo.x (Vit Ondruch) over 4 years ago
vo.x (Vit Ondruch) wrote in #note-4:
Thank you for the investigation. There is a whole thread on Fedora devel mailing list:
I have posted results of your investigations there
This is the link:
and I think Jeff Law will be back soon with some advice/fix.
Updated by vo.x (Vit Ondruch) over 3 years ago
This is still unresolved :(
Updated by vo.x (Vit Ondruch) over 3 years ago
And it seems that the DWARF is in better shape:
$ nm ./miniruby | grep -w rb_f_kill
00000000001ddcf0 T rb_f_kill
000000000002bd88 t rb_f_kill.cold
$ nm ./miniruby | grep -w rb_f_kill | addr2line -e ./miniruby
/builddir/build/BUILD/ruby-3.0.2/signal.c:424
/builddir/build/BUILD/ruby-3.0.2/numeric.c:2980
Updated by vo.x (Vit Ondruch) over 3 years ago
- Status changed from Third Party's Issue to Open
Changing state back to open, because this is unresolved and I am repeatedly hitting this issue in various reincarnations. So this is the debug session on PPC64LE:
$ gdb --args ./miniruby -e'Process.kill("SIGSEGV",$$)'
GNU gdb (GDB) Fedora 10.2-6.fc35
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "ppc64le-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./miniruby...
warning: File "/builddir/build/BUILD/ruby-3.0.2/.gdbinit" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
add-auto-load-safe-path /builddir/build/BUILD/ruby-3.0.2/.gdbinit
line to your configuration file "/builddir/.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/builddir/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
(gdb) r
Starting program: /builddir/build/BUILD/ruby-3.0.2/miniruby -eProcess.kill\(\"SIGSEGV\",\$\$\)
Download failed: No route to host. Continuing without debug info for /lib64/libz.so.1.
Download failed: No route to host. Continuing without debug info for /lib64/libgmp.so.10.
Download failed: No route to host. Continuing without debug info for /lib64/libcrypt.so.2.
Download failed: No route to host. Continuing without debug info for /lib64/libm.so.6.
Download failed: No route to host. Continuing without debug info for /lib64/libc.so.6.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7aa8810 in kill () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34-1.fc35.ppc64le gmp-6.2.0-7.fc35.ppc64le libxcrypt-4.4.24-1.fc35.ppc64le zlib-1.2.11-30.fc35.ppc64le
(gdb) c
Continuing.
-e:1: [BUG] Segmentation fault at 0x590fb15c0000001f
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [powerpc64le-linux]
-- Control frame information -----------------------------------------------
c:0003 p:---- s:0012 e:000011 CFUNC :kill
c:0002 p:0015 s:0006 e:000005 EVAL -e:1 [FINISH]
c:0001 p:0000 s:0003 E:0013f0 (none) [FINISH]
-- Ruby level backtrace information ----------------------------------------
-e:1:in `<main>'
-e:1:in `kill'
-- C level backtrace information -------------------------------------------
Program received signal SIGSEGV, Segmentation fault.
0x000000010031fa44 in uleb128 (p=0x1005986a0) at addr2line.c:200
200 unsigned char b = *(unsigned char *)(*p)++;
(gdb) bt
#0 0x000000010031fa44 in uleb128 (p=0x1005986a0) at addr2line.c:200
#1 di_read_die (reader=0x100598668, die=0x100598588) at addr2line.c:1335
#2 0x0000000100337d4c in read_abstract_origin (line=0x1005985f8, abstract_origin=760312297580551, reader=0x100598668) at addr2line.c:1604
#3 debug_info_read (traces=<optimized out>, offset=<optimized out>, lines=<optimized out>, num_traces=<optimized out>, reader=<optimized out>) at addr2line.c:1668
#4 fill_lines (num_traces=num_traces@entry=19, check_debuglink=check_debuglink@entry=1, objp=0x100599010, objp@entry=0x1005990a0, lines=lines@entry=0x1005e2c50, offset=0, offset@entry=-1,
traces=<optimized out>) at addr2line.c:1888
#5 0x0000000100338d1c in rb_dump_backtrace_with_lines.constprop.0 (num_traces=<optimized out>, traces=<optimized out>) at addr2line.c:2287
#6 0x000000010031dfec in rb_print_backtrace () at vm_dump.c:760
#7 0x00000001003356a8 in rb_vm_bugreport.constprop.0 (ctx=<optimized out>) at vm_dump.c:998
#8 0x00000001000cd8c4 in rb_bug_for_fatal_signal (default_sighandler=0x0, sig=<optimized out>, ctx=<optimized out>, fmt=0x100383a98 "Segmentation fault at %p") at error.c:786
#9 0x0000000100256388 in sigsegv (sig=<optimized out>, info=0x10059a330, ctx=0x1005995b0) at signal.c:960
#10 <signal handler called>
#11 0x00007ffff7aa8810 in kill () from /lib64/libc.so.6
#12 0x000000010025ad70 in rb_f_kill (argc=<optimized out>, argv=0x7ffff78e0050) at signal.c:439
#13 0x00000001001fb448 in proc_rb_f_kill (c=<optimized out>, v=<optimized out>, _=<optimized out>) at process.c:8605
#14 0x00000001002e92d8 in ractor_safe_call_cfunc_m1 (recv=<optimized out>, argc=<optimized out>, argv=<optimized out>, func=<optimized out>) at vm_insnhelper.c:2739
#15 0x00000001002f2120 in vm_call_cfunc_with_frame (ec=0x100491ac0, reg_cfp=0x7fffffffe170, calling=<optimized out>) at vm_insnhelper.c:2929
#16 0x00000001002f4f34 in vm_sendish (ec=0x100491ac0, reg_cfp=0x7ffff79dffa0, cd=0x1005c2eb0, block_handler=<optimized out>, method_explorer=<optimized out>) at vm_insnhelper.c:4530
#17 0x00000001002fa2ac in vm_exec_core (ec=0x100491ac0, initial=<optimized out>) at insns.def:789
#18 0x0000000100315200 in rb_vm_exec (ec=0x100491ac0, mjit_enable_p=<optimized out>) at vm.c:2172
#19 0x0000000100317040 in rb_iseq_eval_main (iseq=0x1004a9fb0) at vm.c:2420
#20 0x00000001000d7c8c in rb_ec_exec_node (ec=ec@entry=0x100491ac0, n=n@entry=0x1004a9fb0) at eval.c:317
#21 0x00000001000d7df4 in ruby_run_node (n=0x1004a9fb0) at eval.c:375
#22 0x000000010002afb8 in main (argc=<optimized out>, argv=<optimized out>) at ./main.c:50
(gdb)
Updated by xtkoba (Tee KOBAYASHI) over 3 years ago
I noticed that addr2line.c
misinterpreted the size of DW_FORM_ref_addr
as the size of an address. In fact it should be interpreted as the size of an offset (32 bits in 32-bit DWARF and 64 bits in 64-bit DWARF) [1]. A patch is attached to fix this.
[1] http://wiki.dwarfstd.org/index.php?title=DWARF_FAQ#How_big_is_a_DW_FORM_ref_addr.3F
I am not sure if this patch resolves the main issue, because it fails to reproduce on my x86_64-linux environment. A noticeable difference between LTO and non-LTO binaries on x86_64-linux is that DW_AT_abstract_origin
attribute is of the form DW_FORM_ref_addr
in the former and DW_FORM_ref4
in the latter.
Note that there is another size misinterpretation in addr2line.c
, which is fixed in the patch for #17823. I recommend it be applied also.
Updated by xtkoba (Tee KOBAYASHI) over 3 years ago
Now I can reproduce this on my qemu-user-emulated powerpc64le-linux-musl environment:
$ powerpc64le-musl-qemu -strace ./miniruby -e 'Process.kill :SEGV, $$'
(...)
24209 readlink("/proc/self/exe",0x0000004000490bd8,4096) = 75
24209 open("/var/tmp/ruby/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/miniruby",O_RDONLY|O_LARGEFILE) = 5
24209 _llseek(5,0,0,0x0000004004160a70,SEEK_END) = 0
24209 _llseek(5,0,0,0x0000004004160a70,SEEK_SET) = 0
24209 mmap(NULL,24128544,PROT_READ,MAP_SHARED,5,0) = 0x000000400566b000
24209 close(5) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=1, si_addr=NULL} ---
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
And it turns out that the patch attached in #note-9 is not enough to resolve this. I'm going to try to debug addr2line.c
further.
Updated by xtkoba (Tee KOBAYASHI) over 3 years ago
It turns out that DW_FORM_ref_addr
is not interpreted correctly. A patch is attached to just ignore that form (and other forms that are not supposed to be used currently in addr2line.c
). This ignorance does not seem to affect the quality of C backtrace printing very much, and so I am not going to add the interpretation of currently ignored forms for now.
FYI, the forms that can be used for a value of DW_AT_abstract_origin
are described in 7.5.5 of DWARF 5 specification (pp.217--218 in DWARF5.pdf).
Updated by xtkoba (Tee KOBAYASHI) over 3 years ago
- Status changed from Open to Closed
Applied in changeset git|72317b333b85eed483ad00bcd4f40944019a7c13.
Ignore DW_FORM_ref_addr
[Bug #17052]
Ignore DW_FORM_ref_addr
form and other forms that are not supposed
to be used currently.
Updated by vo.x (Vit Ondruch) over 3 years ago
- Status changed from Closed to Open
Thank you for looking into this.
Unfortunately, I have tried git|72317b333b85eed483ad00bcd4f40944019a7c13 and I see no difference.
BTW I test Ruby 3.0.2 including the following commits:
Updated by xtkoba (Tee KOBAYASHI) over 3 years ago
Aren't there any subtle differences in GDB outputs? If fixed as intended, the value of abstract_origin
can't get such large as 760312297580551 (0x2b380006a6807).
Updated by vo.x (Vit Ondruch) over 3 years ago
This is the GDB output:
(gdb) r
Starting program: /builddir/build/BUILD/ruby-3.0.2/miniruby -e Process.kill\ :SEGV,\ \$\$
Download failed: No route to host. Continuing without debug info for /lib64/libz.so.1.
Download failed: No route to host. Continuing without debug info for /lib64/libgmp.so.10.
Download failed: No route to host. Continuing without debug info for /lib64/libcrypt.so.2.
Download failed: No route to host. Continuing without debug info for /lib64/libm.so.6.
Download failed: No route to host. Continuing without debug info for /lib64/libc.so.6.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7aa8810 in kill () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34-1.fc35.ppc64le gmp-6.2.0-7.fc35.ppc64le libxcrypt-4.4.25-1.fc35.ppc64le zlib-1.2.11-30.fc35.ppc64le
(gdb) c
Continuing.
-e:1: [BUG] Segmentation fault at 0x590fb15c0000001f
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [powerpc64le-linux]
-- Control frame information -----------------------------------------------
c:0003 p:---- s:0012 e:000011 CFUNC :kill
c:0002 p:0015 s:0006 e:000005 EVAL -e:1 [FINISH]
c:0001 p:0000 s:0003 E:0013f0 (none) [FINISH]
-- Ruby level backtrace information ----------------------------------------
-e:1:in `<main>'
-e:1:in `kill'
-- C level backtrace information -------------------------------------------
Program received signal SIGSEGV, Segmentation fault.
0x000000010031fb00 in uleb128 (p=0x1005986c0) at addr2line.c:200
200 unsigned char b = *(unsigned char *)(*p)++;
(gdb) bt
#0 0x000000010031fb00 in uleb128 (p=0x1005986c0) at addr2line.c:200
#1 di_read_die (reader=0x100598668, die=0x100598578) at addr2line.c:1343
#2 0x00000001003379a8 in debug_info_read (traces=<optimized out>, offset=<optimized out>, lines=<optimized out>, num_traces=<optimized out>, reader=<optimized out>) at addr2line.c:1650
#3 fill_lines (num_traces=num_traces@entry=19, check_debuglink=check_debuglink@entry=1, objp=0x100599010, objp@entry=0x1005990a0, lines=lines@entry=0x1005e2c20, offset=0, offset@entry=-1,
traces=<optimized out>) at addr2line.c:1907
#4 0x0000000100338cfc in rb_dump_backtrace_with_lines.constprop.0 (num_traces=<optimized out>, traces=<optimized out>) at addr2line.c:2306
#5 0x000000010031df2c in rb_print_backtrace () at vm_dump.c:760
#6 0x0000000100335668 in rb_vm_bugreport.constprop.0 (ctx=<optimized out>) at vm_dump.c:998
#7 0x00000001000cd8a4 in rb_bug_for_fatal_signal (default_sighandler=0x0, sig=<optimized out>, ctx=<optimized out>, fmt=0x100383a78 "Segmentation fault at %p") at error.c:786
#8 0x00000001002562e8 in sigsegv (sig=<optimized out>, info=0x10059a330, ctx=0x1005995b0) at signal.c:960
#9 <signal handler called>
#10 0x00007ffff7aa8810 in kill () from /lib64/libc.so.6
#11 0x000000010025acd0 in rb_f_kill (argc=<optimized out>, argv=0x7ffff78e0050) at signal.c:439
#12 0x00000001001fb3a8 in proc_rb_f_kill (c=<optimized out>, v=<optimized out>, _=<optimized out>) at process.c:8605
#13 0x00000001002e9238 in ractor_safe_call_cfunc_m1 (recv=<optimized out>, argc=<optimized out>, argv=<optimized out>, func=<optimized out>) at vm_insnhelper.c:2739
#14 0x00000001002f2080 in vm_call_cfunc_with_frame (ec=0x100491ac0, reg_cfp=0x7fffffffe170, calling=<optimized out>) at vm_insnhelper.c:2929
#15 0x00000001002f4e94 in vm_sendish (ec=0x100491ac0, reg_cfp=0x7ffff79dffa0, cd=0x1005c2e80, block_handler=<optimized out>, method_explorer=<optimized out>) at vm_insnhelper.c:4530
#16 0x00000001002fa20c in vm_exec_core (ec=0x100491ac0, initial=<optimized out>) at insns.def:789
#17 0x0000000100315140 in rb_vm_exec (ec=0x100491ac0, mjit_enable_p=<optimized out>) at vm.c:2172
#18 0x0000000100316f80 in rb_iseq_eval_main (iseq=0x1004a9fd8) at vm.c:2420
#19 0x00000001000d7c6c in rb_ec_exec_node (ec=ec@entry=0x100491ac0, n=n@entry=0x1004a9fd8) at eval.c:317
#20 0x00000001000d7dd4 in ruby_run_node (n=0x1004a9fd8) at eval.c:375
#21 0x000000010002afb8 in main (argc=<optimized out>, argv=<optimized out>) at ./main.c:50
(gdb) l uleb128
file: "addr2line.c", line number: 195, symbol: "uleb128"
190
191 /* Avoid consuming stack as this module may be used from signal handler */
192 static char binary_filename[PATH_MAX + 1];
193
194 static unsigned long
195 uleb128(char **p)
196 {
197 unsigned long r = 0;
198 int s = 0;
199 for (;;) {
file: "addr2line.c", line number: 197, symbol: "uleb128"
192 static char binary_filename[PATH_MAX + 1];
193
194 static unsigned long
195 uleb128(char **p)
196 {
197 unsigned long r = 0;
198 int s = 0;
199 for (;;) {
200 unsigned char b = *(unsigned char *)(*p)++;
201 if (b < 0x80) {
file: "addr2line.c", line number: 200, symbol: "uleb128"
195 uleb128(char **p)
196 {
197 unsigned long r = 0;
198 int s = 0;
199 for (;;) {
200 unsigned char b = *(unsigned char *)(*p)++;
201 if (b < 0x80) {
202 r += (unsigned long)b << s;
203 break;
204 }
file: "addr2line.c", line number: 201, symbol: "uleb128"
196 {
197 unsigned long r = 0;
198 int s = 0;
199 for (;;) {
200 unsigned char b = *(unsigned char *)(*p)++;
201 if (b < 0x80) {
202 r += (unsigned long)b << s;
203 break;
204 }
205 r += (b & 0x7f) << s;
(gdb)
206 s += 7;
207 }
208 return r;
209 }
210
211 static long
212 sleb128(char **p)
213 {
214 long r = 0;
215 int s = 0;
(gdb) l di_read_die
1328 }
1329 #endif
1330
1331 static DIE *
1332 di_read_die(DebugInfoReader *reader, DIE *die)
1333 {
1334 uint64_t abbrev_number = uleb128(&reader->p);
1335 if (abbrev_number == 0) {
1336 reader->level--;
1337 return NULL;
(gdb)
1338 }
1339
1340 reader->q = di_find_abbrev(reader, abbrev_number);
1341
1342 die->pos = reader->p - reader->obj->debug_info.ptr - 1;
1343 die->tag = (int)uleb128(&reader->q); /* tag */
1344 die->has_children = *reader->q++; /* has_children */
1345 if (die->has_children) {
1346 reader->level++;
1347 }
(gdb)
1348 return die;
1349 }
1350
1351 static DebugInfoValue *
1352 di_read_record(DebugInfoReader *reader, DebugInfoValue *vp)
1353 {
1354 uint64_t at = uleb128(&reader->q);
1355 uint64_t form = uleb128(&reader->q);
1356 if (!at || !form) return NULL;
1357 vp->at = at;
(gdb) l debug_info_read
1643 debug_info_read(DebugInfoReader *reader, int num_traces, void **traces,
1644 line_info_t *lines, int offset) {
1645 while (reader->p < reader->cu_end) {
1646 DIE die;
1647 ranges_t ranges = {};
1648 line_info_t line = {};
1649
1650 if (!di_read_die(reader, &die)) continue;
1651 /* fprintf(stderr,"%d:%tx: <%d>\n",__LINE__,die.pos,reader->level,die.tag); */
1652
(gdb)
1653 if (die.tag != DW_TAG_subprogram && die.tag != DW_TAG_inlined_subroutine) {
1654 skip_die:
1655 di_skip_records(reader);
1656 continue;
1657 }
1658
1659 /* enumerate abbrev */
1660 for (;;) {
1661 DebugInfoValue v = {{}};
1662 /* ptrdiff_t pos = reader->p - reader->p0; */
(gdb)
1663 if (!di_read_record(reader, &v)) break;
1664 /* fprintf(stderr,"\n%d:%tx: AT:%lx FORM:%lx\n",__LINE__,pos,v.at,v.form); */
1665 /* div_inspect(&v); */
1666 switch (v.at) {
1667 case DW_AT_name:
1668 line.sname = get_cstr_value(&v);
1669 break;
1670 case DW_AT_call_file:
1671 fill_filename((int)v.as.uint64, reader->debug_line_directories, reader->debug_line_files, &line, reader->obj);
1672 break;
(gdb)
1673 case DW_AT_call_line:
1674 line.line = (int)v.as.uint64;
1675 break;
1676 case DW_AT_low_pc:
1677 case DW_AT_high_pc:
1678 case DW_AT_ranges:
1679 ranges_set(&ranges, &v);
1680 break;
1681 case DW_AT_declaration:
1682 goto skip_die;
(gdb)
1683 case DW_AT_inline:
1684 /* 1 or 3 */
1685 break; /* goto skip_die; */
1686 case DW_AT_abstract_origin:
1687 read_abstract_origin(reader, v.form, v.as.uint64, &line);
1688 break; /* goto skip_die; */
1689 }
1690 }
1691 /* ranges_inspect(reader, &ranges); */
1692 /* fprintf(stderr,"%d:%tx: %x ",__LINE__,diepos,die.tag); */
(gdb)
1693 for (int i=offset; i < num_traces; i++) {
1694 uintptr_t addr = (uintptr_t)traces[i];
1695 uintptr_t offset = addr - reader->obj->base_addr + reader->obj->vmaddr;
1696 uintptr_t saddr = ranges_include(reader, &ranges, offset);
1697 if (saddr) {
1698 /* fprintf(stderr, "%d:%tx: %d %lx->%lx %x %s: %s/%s %d %s %s %s\n",__LINE__,die.pos, i,addr,offset, die.tag,line.sname,line.dirname,line.filename,line.line,reader->obj->path,line.sname,lines[i].sname); */
1699 if (lines[i].sname) {
1700 line_info_t *lp = malloc(sizeof(line_info_t));
1701 memcpy(lp, &lines[i], sizeof(line_info_t));
1702 lines[i].next = lp;
(gdb)
1703 lp->dirname = line.dirname;
1704 lp->filename = line.filename;
1705 lp->line = line.line;
1706 lp->saddr = 0;
1707 }
1708 lines[i].path = reader->obj->path;
1709 lines[i].base_addr = line.base_addr;
1710 lines[i].sname = line.sname;
1711 lines[i].saddr = saddr + reader->obj->base_addr - reader->obj->vmaddr;
1712 }
(gdb)
1713 }
1714 }
1715 }
1716
1717 #ifdef USE_ELF
1718 static unsigned long
1719 uncompress_debug_section(ElfW(Shdr) *shdr, char *file, char **ptr)
1720 {
1721 #ifdef SUPPORT_COMPRESSED_DEBUG_LINE
1722 ElfW(Chdr) *chdr = (ElfW(Chdr) *)(file + shdr->sh_offset);
(gdb)
Updated by vo.x (Vit Ondruch) over 3 years ago
Ah, I can see there is missing the read_abstract_origin
on the backtrace.
Updated by xtkoba (Tee KOBAYASHI) over 3 years ago
It turns out that di_read_debug_abbrev_cu
in addr2line.c
does not work correctly when abbrev_number
is not aligned in ascending order for each CU entry of .debug_abbrev
. A patch is attached to fix this.
Note again that I cannot reproduce this in my environment and so this patch does not necessarily resolve the main problem. Thank you for your patience.
Updated by xtkoba (Tee KOBAYASHI) over 3 years ago
A revised patch (from #note-17) is attached so that abbrev_number >= 256
is handled properly, although it might be practically not going to happen.
Updated by xtkoba (Tee KOBAYASHI) over 3 years ago
FWIW, another bug is found in addr2line.c
(#18081), which I believe is irrevant to the issue here, though.
Updated by vo.x (Vit Ondruch) over 3 years ago
The result with the latest patch is significantly different:
$ gdb --args ./miniruby -e'Process.kill("SIGSEGV",$$)'
GNU gdb (GDB) Fedora 10.2-6.fc35
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "ppc64le-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./miniruby...
warning: File "/builddir/build/BUILD/ruby-3.0.2/.gdbinit" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
add-auto-load-safe-path /builddir/build/BUILD/ruby-3.0.2/.gdbinit
line to your configuration file "/builddir/.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/builddir/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
(gdb) r
Starting program: /builddir/build/BUILD/ruby-3.0.2/miniruby -eProcess.kill\(\"SIGSEGV\",\$\$\)
Download failed: No route to host. Continuing without debug info for /lib64/libz.so.1.
Download failed: No route to host. Continuing without debug info for /lib64/libgmp.so.10.
Download failed: No route to host. Continuing without debug info for /lib64/libcrypt.so.2.
Download failed: No route to host. Continuing without debug info for /lib64/libm.so.6.
Download failed: No route to host. Continuing without debug info for /lib64/libc.so.6.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7aa8810 in kill () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34-1.fc35.ppc64le gmp-6.2.0-7.fc35.ppc64le libxcrypt-4.4.25-1.fc35.ppc64le zlib-1.2.11-30.fc35.ppc64le
(gdb) c
Continuing.
-e:1: [BUG] Segmentation fault at 0x590fb15c00000020
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [powerpc64le-linux]
-- Control frame information -----------------------------------------------
c:0003 p:---- s:0012 e:000011 CFUNC :kill
c:0002 p:0015 s:0006 e:000005 EVAL -e:1 [FINISH]
c:0001 p:0000 s:0003 E:0013f0 (none) [FINISH]
-- Ruby level backtrace information ----------------------------------------
-e:1:in `<main>'
-e:1:in `kill'
-- C level backtrace information -------------------------------------------
1267: Abbrev Number 106 not found
Program received signal SIGABRT, Aborted.
0x00007ffff7b09864 in __pthread_kill_internal () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff7b09864 in __pthread_kill_internal () from /lib64/libc.so.6
#1 0x00007ffff7aa848c in raise () from /lib64/libc.so.6
#2 0x00007ffff7a81404 in abort () from /lib64/libc.so.6
#3 0x000000010031fe70 in di_abbrev_table (abbrev_number=106, reader=0x100598660) at addr2line.c:1268
#4 di_find_abbrev (abbrev_number=106, reader=0x100598660) at addr2line.c:1279
#5 di_read_die (reader=0x100598660, die=0x100598570) at addr2line.c:1349
#6 0x0000000100337a3c in debug_info_read (traces=<optimized out>, offset=<optimized out>, lines=<optimized out>, num_traces=<optimized out>, reader=<optimized out>) at addr2line.c:1658
#7 fill_lines (num_traces=num_traces@entry=19, check_debuglink=check_debuglink@entry=1, objp=0x100599010, objp@entry=0x1005990a0, lines=lines@entry=0x1005e2c50, offset=0, offset@entry=-1,
traces=<optimized out>) at addr2line.c:1915
#8 0x0000000100338d9c in rb_dump_backtrace_with_lines.constprop.0 (num_traces=<optimized out>, traces=<optimized out>) at addr2line.c:2314
#9 0x000000010031e04c in rb_print_backtrace () at vm_dump.c:760
#10 0x00000001003353d8 in rb_vm_bugreport.constprop.0 (ctx=<optimized out>) at vm_dump.c:998
#11 0x00000001000cd884 in rb_bug_for_fatal_signal (default_sighandler=0x0, sig=<optimized out>, ctx=<optimized out>, fmt=0x100383b18 "Segmentation fault at %p") at error.c:786
#12 0x00000001002563c8 in sigsegv (sig=<optimized out>, info=0x10059a330, ctx=0x1005995b0) at signal.c:960
#13 <signal handler called>
#14 0x00007ffff7aa8810 in kill () from /lib64/libc.so.6
#15 0x000000010025adb0 in rb_f_kill (argc=<optimized out>, argv=0x7ffff78e0050) at signal.c:439
#16 0x00000001001fb488 in proc_rb_f_kill (c=<optimized out>, v=<optimized out>, _=<optimized out>) at process.c:8605
#17 0x00000001002e9358 in ractor_safe_call_cfunc_m1 (recv=<optimized out>, argc=<optimized out>, argv=<optimized out>, func=<optimized out>) at vm_insnhelper.c:2739
#18 0x00000001002f21a0 in vm_call_cfunc_with_frame (ec=0x100491ac0, reg_cfp=0x7fffffffe170, calling=<optimized out>) at vm_insnhelper.c:2929
#19 0x00000001002f4fb4 in vm_sendish (ec=0x100491ac0, reg_cfp=0x7ffff79dffa0, cd=0x1005c2eb0, block_handler=<optimized out>, method_explorer=<optimized out>) at vm_insnhelper.c:4530
#20 0x00000001002fa32c in vm_exec_core (ec=0x100491ac0, initial=<optimized out>) at insns.def:789
#21 0x0000000100315260 in rb_vm_exec (ec=0x100491ac0, mjit_enable_p=<optimized out>) at vm.c:2172
#22 0x00000001003170a0 in rb_iseq_eval_main (iseq=0x1004a9fb0) at vm.c:2420
#23 0x00000001000d7c4c in rb_ec_exec_node (ec=ec@entry=0x100491ac0, n=n@entry=0x1004a9fb0) at eval.c:317
#24 0x00000001000d7db4 in ruby_run_node (n=0x1004a9fb0) at eval.c:375
#25 0x000000010002afb8 in main (argc=<optimized out>, argv=<optimized out>) at ./main.c:50
(gdb)
But probably still not the right output though. I'd still expect some backtrace reported.
Updated by xtkoba (Tee KOBAYASHI) over 3 years ago
The message
1267: Abbrev Number 106 not found
means that addr2line.c
could not find an abbrev table entry in .debug_abbrev
referenced from .debug_info
. Isn't it possible that either section is broken?
It might be helpful if can investigate those sections, in binary or objdump
(from GNU Binutils) text output of
objdump -Wi miniruby
objdump -Wa miniruby
or llvm-dwarfdump
output of
llvm-dwarfdump --debug-info miniruby
llvm-dwarfdump --debug-abbrev miniruby
Updated by vo.x (Vit Ondruch) over 3 years ago
xtkoba (Tee KOBAYASHI) wrote in #note-21:
Isn't it possible that either section is broken?
Well, it should be DWARF5, which is pretty recent and on ppc64le, so there might be some surprises ...
It might be helpful if can investigate those sections, in binary or
objdump
(from GNU Binutils) text output of
The dumps should be available here: https://vondruch.fedorapeople.org/objdump.tar.gz
Updated by xtkoba (Tee KOBAYASHI) over 3 years ago
vo.x (Vit Ondruch) wrote in #note-22:
The dumps should be available here: https://vondruch.fedorapeople.org/objdump.tar.gz
I'm afraid that this contains only .debug_info
and not .debug_abbrev
.
It seems that objdump
can dump the .debug_info
section without error, so maybe it is not broken. And of course it is probable that addr2line.c
makes wrong assumptions on DWARF format.
Updated by vo.x (Vit Ondruch) over 3 years ago
I have recreated and reuploaded the tarball. Same URL should work.
Updated by xtkoba (Tee KOBAYASHI) over 3 years ago
- File ruby-addr2line-debug_info_read-debug-code-0001.patch ruby-addr2line-debug_info_read-debug-code-0001.patch added
Nothing strange found in .debug_abbrev
, either.
A patch is attached to help ease debugging.
Updated by vo.x (Vit Ondruch) about 3 years ago
(gdb) c
Continuing.
-e:1: [BUG] Segmentation fault at 0x590fb15c00000020
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [powerpc64le-linux]
-- Control frame information -----------------------------------------------
c:0003 p:---- s:0012 e:000011 CFUNC :kill
c:0002 p:0015 s:0006 e:000005 EVAL -e:1 [FINISH]
c:0001 p:0000 s:0003 E:0013f0 (none) [FINISH]
-- Ruby level backtrace information ----------------------------------------
-e:1:in `<main>'
-e:1:in `kill'
-- C level backtrace information -------------------------------------------
1273: debug_info 0x54 Abbrev Number 106 not found
Program received signal SIGABRT, Aborted.
0x00007ffff7b09864 in __pthread_kill_internal () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff7b09864 in __pthread_kill_internal () from /lib64/libc.so.6
#1 0x00007ffff7aa848c in raise () from /lib64/libc.so.6
#2 0x00007ffff7a81404 in abort () from /lib64/libc.so.6
#3 0x000000010031fe70 in di_abbrev_table (abbrev_number=106, reader=0x100598660) at addr2line.c:1274
#4 di_find_abbrev (abbrev_number=106, reader=0x100598660) at addr2line.c:1285
#5 di_read_die (reader=0x100598660, die=0x100598570) at addr2line.c:1355
#6 0x00000001003370cc in debug_info_read (traces=<optimized out>, offset=<optimized out>, lines=<optimized out>, num_traces=<optimized out>, reader=<optimized out>) at addr2line.c:1664
#7 fill_lines (num_traces=num_traces@entry=19, check_debuglink=check_debuglink@entry=1, objp=0x100599010, objp@entry=0x1005990a0, lines=lines@entry=0x1005e2c50, offset=<optimized out>, offset@entry=-1,
traces=<optimized out>) at addr2line.c:1929
#8 0x0000000100338dec in rb_dump_backtrace_with_lines.constprop.0 (num_traces=<optimized out>, traces=<optimized out>) at addr2line.c:2328
#9 0x000000010031e04c in rb_print_backtrace () at vm_dump.c:760
#10 0x0000000100335418 in rb_vm_bugreport.constprop.0 (ctx=<optimized out>) at vm_dump.c:998
#11 0x00000001000cd884 in rb_bug_for_fatal_signal (default_sighandler=0x0, sig=<optimized out>, ctx=<optimized out>, fmt=0x100383b58 "Segmentation fault at %p") at error.c:786
#12 0x0000000100256408 in sigsegv (sig=<optimized out>, info=0x10059a330, ctx=0x1005995b0) at signal.c:960
#13 <signal handler called>
#14 0x00007ffff7aa8810 in kill () from /lib64/libc.so.6
#15 0x000000010025adf0 in rb_f_kill (argc=<optimized out>, argv=0x7ffff78e0050) at signal.c:439
#16 0x00000001001fb4c8 in proc_rb_f_kill (c=<optimized out>, v=<optimized out>, _=<optimized out>) at process.c:8605
#17 0x00000001002e9358 in ractor_safe_call_cfunc_m1 (recv=<optimized out>, argc=<optimized out>, argv=<optimized out>, func=<optimized out>) at vm_insnhelper.c:2739
#18 0x00000001002f21a0 in vm_call_cfunc_with_frame (ec=0x100491ac0, reg_cfp=0x7fffffffe170, calling=<optimized out>) at vm_insnhelper.c:2929
#19 0x00000001002f4fb4 in vm_sendish (ec=0x100491ac0, reg_cfp=0x7ffff79dffa0, cd=0x1005c2eb0, block_handler=<optimized out>, method_explorer=<optimized out>) at vm_insnhelper.c:4530
#20 0x00000001002fa32c in vm_exec_core (ec=0x100491ac0, initial=<optimized out>) at insns.def:789
#21 0x0000000100315260 in rb_vm_exec (ec=0x100491ac0, mjit_enable_p=<optimized out>) at vm.c:2172
#22 0x00000001003170a0 in rb_iseq_eval_main (iseq=0x1004a9fb0) at vm.c:2420
#23 0x00000001000d7c4c in rb_ec_exec_node (ec=ec@entry=0x100491ac0, n=n@entry=0x1004a9fb0) at eval.c:317
#24 0x00000001000d7db4 in ruby_run_node (n=0x1004a9fb0) at eval.c:375
#25 0x000000010002afb8 in main (argc=<optimized out>, argv=<optimized out>) at ./main.c:50
(gdb) f 3
#3 0x000000010031fe70 in di_abbrev_table (abbrev_number=106, reader=0x100598660) at addr2line.c:1274
1274 abort();
(gdb) l
1269 while (a) {
1270 len++;
1271 a >>= 7;
1272 }
1273 fprintf(stderr,"%d: debug_info 0x%"PRIxPTR" Abbrev Number %"PRId64" not found\n",__LINE__, reader->p - reader->obj->debug_info.ptr - len, abbrev_number);
1274 abort();
1275 }
1276 return p;
1277 }
1278
(gdb)
Updated by xtkoba (Tee KOBAYASHI) about 3 years ago
Byte 0x54 of the .debug_info
is the 3rd byte of
<52> DW_AT_abstract_origin: <0x6a6c3a>
of the form DW_FORM_ref_addr
which is in little endian 0x6a == 106 (and also the same value when interpreted as ULEB128). So apparently the executable misreads the .debug_info
.
What feels weird to me is that my executable of miniruby has almost the same pattern as yours at the top portion of the .debug_info
section, and yet does not cause any errors. Possibly addr2line.c
is miscompiled in any way?
Updated by vo.x (Vit Ondruch) about 3 years ago
Just FTR, these are the build options:
gcc -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -mcpu=power8 -mtune=power8 -fasynchronous-unwind-tables -fstack-clash-protection -fPIC -m64 -D_FORTIFY_SOURCE=2 -fstack-protector-strong -fno-strict-overflow -fvisibility=hidden -fexcess-precision=standard -DRUBY_EXPORT -I. -I.ext/include/powerpc64le-linux -I./include -I. -I./enc/unicode/12.1.0 -o addr2line.o -c addr2line.c
And the GCC version:
$ rpm -q gcc
gcc-11.2.1-1.fc35.ppc64le
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/ppc64le-redhat-linux/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: ppc64le-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-targets=powerpcle-linux --disable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl=/builddir/build/BUILD/gcc-11.2.1-20210728/obj-ppc64le-redhat-linux/isl-install --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-secureplt --with-long-double-128 --with-cpu-32=power8 --with-tune-32=power8 --with-cpu-64=power8 --with-tune-64=power8 --build=ppc64le-redhat-linux --with-build-config=bootstrap-lto --enable-link-serialization=1
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.2.1 20210728 (Red Hat 11.2.1-1) (GCC)
Updated by xtkoba (Tee KOBAYASHI) about 3 years ago
The behavior seems as if the patch in #note-9 (https://bugs.ruby-lang.org/attachments/8974) were not applied. If I reverse-apply this patch, yes I get the following output:
1300: debug_info 0x54 Abbrev Number 101 not found
Aborted
Updated by vo.x (Vit Ondruch) about 3 years ago
xtkoba (Tee KOBAYASHI) wrote in #note-29:
The behavior seems as if the patch in #note-9
Should it be applied? Based on the #note-10, #note-11 and #note-12, I thought that one is obsolete. That might be misunderstanding on my side, sorry. So let me add it into the mix (I have hig expectations now :D)
Updated by xtkoba (Tee KOBAYASHI) about 3 years ago
I might have been misleading. The patch in #note-9 is necessary for correctly reading DW_FORM_ref_addr
values in .debug_info
sections, even if those values are kept uninterpreted by the patch in #note-11. Both patches have been merged in master already. The rest of the patches I posted here won't be necessary.
Updated by vo.x (Vit Ondruch) about 3 years ago
- Backport changed from 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN to 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: REQUIRED
xtkoba (Tee KOBAYASHI) wrote in #note-31:
I might have been misleading. The patch in #note-9 is necessary for correctly reading
DW_FORM_ref_addr
values in.debug_info
sections, even if those values are kept uninterpreted by the patch in #note-11. Both patches have been merged in master already. The rest of the patches I posted here won't be necessary.
Ah, I am double checking what was committed on master and now I can see that the patch from #note-9 was indeed applied into master, but as part of the git|a9977ba2f9863e3fb1b2346589ebbca67d80536c which made the change invisible. Oh my :/ But this will be helpful for backporting, right? ;)
Nevertheless, I was able to build Ruby on ppc64le with LTO enabled. I'll try to polish the necessary changes tomorrow, send PR against Fedora and check if other arches still works.
Thx a lot for your hard work!
Updated by vo.x (Vit Ondruch) about 3 years ago
xtkoba (Tee KOBAYASHI) wrote in #note-31:
I might have been misleading. The patch in #note-9 is necessary for correctly reading
DW_FORM_ref_addr
values in.debug_info
sections, even if those values are kept uninterpreted by the patch in #note-11. Both patches have been merged in master already. The rest of the patches I posted here won't be necessary.
This is is the Fedora commit:
https://src.fedoraproject.org/rpms/ruby/c/1170d812fe0cd43aeddf7f15a63c2e6db3af8059
The two patches you have mentioned are enough to pass the test suite. I leave the rest to be backported eventually (BTW this was reported against Ruby 2.7 originally and it is likely still affected, but I don't think we are going to enable LTO for Ruby 2.7 at this stage, so the 2.7 backport is low priority for me).
So this is resolved from my POV unless you want to apply the ruby-addr2line-abbrev_table-r1.patch
Thx once again.
Updated by xtkoba (Tee KOBAYASHI) about 3 years ago
The situation ruby-addr2line-abbrev_table-r1.patch
assumes is not likely to happen in the real world. Let's leave it unapplied for now. Thank you.
Updated by jeremyevans0 (Jeremy Evans) about 3 years ago
- Status changed from Open to Closed
Updated by nagachika (Tomoyuki Chikanaga) about 3 years ago
MEMO: Backporting 72317b333b85eed483ad00bcd4f40944019a7c13 depends on https://bugs.ruby-lang.org/issues/17585.