Bug #18133

open

LTO: TestGCCompact#test_ast_compacts segfaults on i686

Added by vo.x (Vit Ondruch) over 1 year ago. Updated 9 days ago.

Status: Assigned
Priority: Normal
Target version: -
ruby -v: ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [i386-linux]
[ruby-core:105069]

Description

I observe the following segfault when running the test suite on i686 on RHEL 9:

$ gdb --args ./miniruby -I./lib -I. -I.ext/common  ./tool/runruby.rb --extout=.ext  -- --disable-gems ./test/runner.rb --excludes-dir=./test/excludes -v

... snip ...

(gdb) handle SIGPIPE noprint nostop pass
Signal        Stop	Print	Pass to program	Description
SIGPIPE       No	No	Yes		Broken pipe
(gdb) r

... snip ...

[ 8347/20497] TestGBK#test_mbc_enc_len = 0.00 s
[ 8348/20497] TestGBK#test_mbc_to_code = 0.00 s
[ 8349/20497] TestGCCompact#test_ast_compacts

Thread 1 "ruby" received signal SIGSEGV, Segmentation fault.
0xf7e33fe6 in rb_class_remove_from_super_subclasses (klass=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/class.c:96
96		    RCLASS_EXT(entry->next->klass)->parent_subclasses = RCLASS_EXT(klass)->parent_subclasses;
(gdb) bt
#0  0xf7e33fe6 in rb_class_remove_from_super_subclasses (klass=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/class.c:96
#1  obj_free (obj=<optimized out>, objspace=0x5655ac30) at /builddir/build/BUILD/ruby-3.0.2/gc.c:3019
#2  gc_page_sweep (sweep_page=0x5a40e1f0, heap=0x5655ac48, objspace=0x5655ac30) at /builddir/build/BUILD/ruby-3.0.2/gc.c:4914
#3  gc_sweep_step.isra.0 (objspace=<optimized out>, heap=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/gc.c:5134
#4  0xf7ca3f09 in gc_sweep_rest (objspace=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/gc.c:5190
#5  gc_sweep (objspace=0x5655ac30) at /builddir/build/BUILD/ruby-3.0.2/gc.c:5313
#6  0xf7ca8250 in gc_marks (full_mark=<optimized out>, objspace=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/gc.c:7504
#7  gc_start (objspace=<optimized out>, reason=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/gc.c:8322
#8  0xf7ca8530 in garbage_collect (objspace=objspace@entry=0x5655ac30, reason=reason@entry=238592) at /builddir/build/BUILD/ruby-3.0.2/gc.c:8210
#9  0xf7caa723 in gc_start_internal (compact=2, immediate_sweep=2, immediate_mark=2, full_mark=2, self=1448715280, ec=0x5655afac) at /builddir/build/BUILD/ruby-3.0.2/gc.c:8553
#10 gc_compact (ec=0x5655afac, self=1448715280) at /builddir/build/BUILD/ruby-3.0.2/gc.c:9468
#11 0xf7dfae3c in invoke_bf (argv=0x0, bf=<optimized out>, reg_cfp=<optimized out>, ec=0x5655afac) at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:5583
#12 vm_invoke_builtin_delegate (ec=0x5655afac, cfp=<optimized out>, bf=<optimized out>, start_index=0) at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:5607
#13 0xf7e0664c in vm_exec_core (ec=0x0, initial=1448732852) at /builddir/build/BUILD/ruby-3.0.2/insns.def:1482
#14 0xf7e1d0d5 in rb_vm_exec (ec=<optimized out>, mjit_enable_p=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/vm.c:2172
#15 0xf7e0c3c9 in invoke_block (captured=<optimized out>, captured=<optimized out>, opt_pc=<optimized out>, type=<optimized out>, cref=0x0, self=1450588460, iseq=0x5669174c, ec=0x5655afac)
    at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:399
#16 invoke_iseq_block_from_c (me=0x0, is_lambda=<optimized out>, cref=0x0, passed_block_handler=0, kw_splat=0, argv=0xffffbf00, argc=1, self=1450588460, captured=<optimized out>, ec=0x5655afac)
    at /builddir/build/BUILD/ruby-3.0.2/vm.c:1335
#17 invoke_block_from_c_bh (force_blockarg=<optimized out>, is_lambda=<optimized out>, cref=<optimized out>, passed_block_handler=<optimized out>, kw_splat=<optimized out>, argv=<optimized out>, 
    argc=<optimized out>, block_handler=<optimized out>, ec=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/vm.c:1353
#18 vm_yield (kw_splat=0, argv=0xffffbf00, argc=1, ec=0x5655afac) at /builddir/build/BUILD/ruby-3.0.2/vm.c:1398
#19 rb_yield_0 (argv=0xffffbf00, argc=1) at /builddir/build/BUILD/ruby-3.0.2/vm_eval.c:1333
#20 rb_yield (val=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/vm_eval.c:1349
#21 0xf7c2ae74 in rb_ary_collect (ary=1503666180) at /builddir/build/BUILD/ruby-3.0.2/array.c:3635
#22 0xf7dfc835 in vm_call_cfunc_with_frame (ec=0x5655afac, reg_cfp=0xf77f6d70, calling=0xffffc004) at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:2929
#23 0xf7dfdd31 in vm_sendish (ec=0x5655afac, reg_cfp=0xf77f6d70, cd=0x566c8f00, block_handler=4152323453, method_explorer=mexp_search_method) at /builddir/build/BUILD/ruby-3.0.2/vm_callinfo.h:336
#24 0xf7e0590a in vm_exec_core (ec=0x0, initial=1448732852) at /builddir/build/BUILD/ruby-3.0.2/insns.def:770
#25 0xf7e1d0d5 in rb_vm_exec (ec=<optimized out>, mjit_enable_p=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/vm.c:2172
#26 0xf7e0c3c9 in invoke_block (captured=<optimized out>, captured=<optimized out>, opt_pc=<optimized out>, type=<optimized out>, cref=0x0, self=1450588460, iseq=0x56691850, ec=0x5655afac)
    at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:399
#27 invoke_iseq_block_from_c (me=0x0, is_lambda=<optimized out>, cref=0x0, passed_block_handler=0, kw_splat=0, argv=0xffffc2b0, argc=1, self=1450588460, captured=<optimized out>, ec=0x5655afac)
    at /builddir/build/BUILD/ruby-3.0.2/vm.c:1335
#28 invoke_block_from_c_bh (force_blockarg=<optimized out>, is_lambda=<optimized out>, cref=<optimized out>, passed_block_handler=<optimized out>, kw_splat=<optimized out>, argv=<optimized out>, 
    argc=<optimized out>, block_handler=<optimized out>, ec=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/vm.c:1353
#29 vm_yield (kw_splat=0, argv=0xffffc2b0, argc=1, ec=0x5655afac) at /builddir/build/BUILD/ruby-3.0.2/vm.c:1398
#30 rb_yield_0 (argv=0xffffc2b0, argc=1) at /builddir/build/BUILD/ruby-3.0.2/vm_eval.c:1333
#31 rb_yield (val=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/vm_eval.c:1349
#32 0xf7c2ac4a in rb_ary_each (ary=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/array.c:2523
#33 rb_ary_each (ary=1501058480) at /builddir/build/BUILD/ruby-3.0.2/array.c:2517
#34 0xf7dfc835 in vm_call_cfunc_with_frame (ec=0x5655afac, reg_cfp=0xf77f6dfc, calling=0xffffc474) at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:2929
#35 0xf7e00602 in vm_call_method_each_type (ec=0x5655afac, cfp=0xf77f6dfc, calling=0xffffc474) at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:3419
#36 0xf7e00a46 in vm_call_refined (calling=<optimized out>, cfp=0xf77f6dfc, ec=0x5655afac) at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:3398
#37 vm_call_method_each_type (ec=0x5655afac, cfp=0xf77f6dfc, calling=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:3476
#38 0xf7dfdd31 in vm_sendish (ec=0x5655afac, reg_cfp=0xf77f6dfc, cd=0x5669f510, block_handler=4152323593, method_explorer=mexp_search_method) at /builddir/build/BUILD/ruby-3.0.2/vm_callinfo.h:336
#39 0xf7e0590a in vm_exec_core (ec=0x0, initial=1448732852) at /builddir/build/BUILD/ruby-3.0.2/insns.def:770
#40 0xf7e1d0d5 in rb_vm_exec (ec=<optimized out>, mjit_enable_p=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/vm.c:2172
#41 0xf7e0c3c9 in invoke_block (captured=<optimized out>, captured=<optimized out>, opt_pc=<optimized out>, type=<optimized out>, cref=0x0, self=1450588460, iseq=0x566900cc, ec=0x5655afac)
    at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:399
#42 invoke_iseq_block_from_c (me=0x0, is_lambda=<optimized out>, cref=0x0, passed_block_handler=0, kw_splat=0, argv=0xffffc720, argc=1, self=1450588460, captured=<optimized out>, ec=0x5655afac)
    at /builddir/build/BUILD/ruby-3.0.2/vm.c:1335
#43 invoke_block_from_c_bh (force_blockarg=<optimized out>, is_lambda=<optimized out>, cref=<optimized out>, passed_block_handler=<optimized out>, kw_splat=<optimized out>, argv=<optimized out>, 
    argc=<optimized out>, block_handler=<optimized out>, ec=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/vm.c:1353
#44 vm_yield (kw_splat=0, argv=0xffffc720, argc=1, ec=0x5655afac) at /builddir/build/BUILD/ruby-3.0.2/vm.c:1398
#45 rb_yield_0 (argv=0xffffc720, argc=1) at /builddir/build/BUILD/ruby-3.0.2/vm_eval.c:1333
#46 rb_yield (val=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/vm_eval.c:1349
#47 0xf7c2ac4a in rb_ary_each (ary=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/array.c:2523
#48 rb_ary_each (ary=1501058920) at /builddir/build/BUILD/ruby-3.0.2/array.c:2517
#49 0xf7dfc835 in vm_call_cfunc_with_frame (ec=0x5655afac, reg_cfp=0xf77f6ec0, calling=0xffffc8e4) at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:2929
#50 0xf7e00602 in vm_call_method_each_type (ec=0x5655afac, cfp=0xf77f6ec0, calling=0xffffc8e4) at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:3419
#51 0xf7e00a46 in vm_call_refined (calling=<optimized out>, cfp=0xf77f6ec0, ec=0x5655afac) at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:3398
#52 vm_call_method_each_type (ec=0x5655afac, cfp=0xf77f6ec0, calling=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:3476
#53 0xf7dfdd31 in vm_sendish (ec=0x5655afac, reg_cfp=0xf77f6ec0, cd=0x566cbca0, block_handler=4152323789, method_explorer=mexp_search_method) at /builddir/build/BUILD/ruby-3.0.2/vm_callinfo.h:336
#54 0xf7e0590a in vm_exec_core (ec=0x0, initial=1448732852) at /builddir/build/BUILD/ruby-3.0.2/insns.def:770
#55 0xf7e1d0d5 in rb_vm_exec (ec=<optimized out>, mjit_enable_p=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/vm.c:2172
#56 0xf7e1da4e in rb_iseq_eval (iseq=0x5657ad18) at /builddir/build/BUILD/ruby-3.0.2/vm.c:2409
#57 0xf7cdb23e in load_iseq_eval (ec=0x5655afac, fname=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/load.c:594
#58 0xf7ce0ef8 in require_internal (ec=<optimized out>, fname=<optimized out>, exception=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/load.c:1065
#59 0xf7ce10ce in rb_require_string (fname=1448587920) at /builddir/build/BUILD/ruby-3.0.2/load.c:1142
#60 0xf7ce117c in rb_f_require_relative (obj=1448845900, fname=1448588380) at /builddir/build/BUILD/ruby-3.0.2/load.c:857
#61 0xf7dfc835 in vm_call_cfunc_with_frame (ec=0x5655afac, reg_cfp=0xf77f6fd8, calling=0xffffce04) at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:2929
#62 0xf7e00602 in vm_call_method_each_type (ec=0x5655afac, cfp=0xf77f6fd8, calling=0xffffce04) at /builddir/build/BUILD/ruby-3.0.2/vm_insnhelper.c:3419
#63 0xf7dfdd31 in vm_sendish (ec=0x5655afac, reg_cfp=0xf77f6fd8, cd=0x56616828, block_handler=0, method_explorer=mexp_search_method) at /builddir/build/BUILD/ruby-3.0.2/vm_callinfo.h:336
#64 0xf7e04d92 in vm_exec_core (ec=0x0, initial=1448732852) at /builddir/build/BUILD/ruby-3.0.2/insns.def:789
#65 0xf7e1d0d5 in rb_vm_exec (ec=<optimized out>, mjit_enable_p=<optimized out>) at /builddir/build/BUILD/ruby-3.0.2/vm.c:2172
#66 0xf7e1db19 in rb_iseq_eval_main (iseq=0x5657b63c) at /builddir/build/BUILD/ruby-3.0.2/vm.c:2420
#67 0xf7c91b99 in rb_ec_exec_node (ec=ec@entry=0x5655afac, n=n@entry=0x5657b63c) at /builddir/build/BUILD/ruby-3.0.2/eval.c:317
#68 0xf7c964fa in ruby_run_node (n=0x5657b63c) at /builddir/build/BUILD/ruby-3.0.2/eval.c:375
#69 0x56556143 in main (argc=<optimized out>, argv=<optimized out>) at ./main.c:50
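
For readers unfamiliar with the faulting statement at class.c:96, the following is a simplified model of the kind of list unlink it performs. It is only a sketch: `struct klass`, `struct sub_entry`, `add_subclass` and `remove_from_parent` are hypothetical stand-ins, not Ruby's actual types or functions. The point is that the unlink dereferences `entry->next->klass`, so it reads invalid memory if that neighbouring entry or its class is stale. The backtrace alone does not establish that this is what happened here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for a class and its subclass-list entry. */
struct klass;

struct sub_entry {
    struct klass *klass;      /* the subclass this entry describes */
    struct sub_entry *next;   /* next sibling in the parent's list */
};

struct klass {
    struct sub_entry *subclasses;          /* head of this class's subclass list */
    struct sub_entry **parent_subclasses;  /* address of the pointer that points at our entry */
};

/* Prepend `child` to `parent`'s subclass list. */
static void add_subclass(struct klass *parent, struct klass *child)
{
    struct sub_entry *entry = malloc(sizeof *entry);
    assert(entry != NULL);
    entry->klass = child;
    entry->next = parent->subclasses;
    if (entry->next) {
        /* The old head is now reached through entry->next. */
        entry->next->klass->parent_subclasses = &entry->next;
    }
    parent->subclasses = entry;
    child->parent_subclasses = &parent->subclasses;
}

/* Unlink `k` from its parent's subclass list. */
static void remove_from_parent(struct klass *k)
{
    if (k->parent_subclasses) {
        struct sub_entry *entry = *k->parent_subclasses;
        *k->parent_subclasses = entry->next;
        if (entry->next) {
            /* Same shape as the statement at class.c:96: this reads
             * entry->next->klass, which must still be a valid object. */
            entry->next->klass->parent_subclasses = k->parent_subclasses;
        }
        free(entry);
    }
    k->parent_subclasses = NULL;
}
```

In this model, every class in the list must stay valid, and at a stable address, for as long as a sibling may still be unlinked.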

Unfortunately:

  1. I don't have a better reproducer than running the whole test suite, and even then the crash is not always triggered. I was not able to hit the issue by running just the single test case or the test file.
  2. I have failed to reproduce this on CentOS Stream 9, which is surprising.

Luckily, I can reproduce it on my system.

This seems to be related to LTO, because I have never hit this issue with LTO disabled.


Files

mmap.patch (9.45 KB) - peterzhu2118 (Peter Zhu), 12/14/2021 08:39 PM
mmap.patch (11.1 KB) - peterzhu2118 (Peter Zhu), 12/15/2021 02:36 PM

Related issues 1 (0 open, 1 closed)

Related to Ruby master - Bug #18746: /TestGCCompact#test_(ast_compacts|compact_count|complex_hash_keys|gc_compact_stats)/ fails on PPC (Closed)

Updated by peterzhu2118 (Peter Zhu) over 1 year ago

The backtrace looks similar to #18119 which is triggered in Ractor.

Updated by vo.x (Vit Ondruch) about 1 year ago

Not sure if I was just lucky on Fedora previously, but while trying to update to Ruby 3.0.3, 8 of the 10 builds I made failed due to this issue.

Updated by peterzhu2118 (Peter Zhu) about 1 year ago

  • Status changed from Open to Closed

Hi, I was able to debug a core dump for this bug. Backports in #18394 should fix it. Thanks for the bug report!

Updated by vo.x (Vit Ondruch) 12 months ago

  • Status changed from Closed to Assigned

Thanks for looking into this. However, applying these two patches fixes i686 but breaks ppc64le :(

[ 8890/21266] TestGCCompact#test_ast_compacts<internal:gc>:213: [BUG] Couldn't unprotect page 0x0000000140f98000
ruby 3.0.3p157 (2021-11-24 revision 3fb7d2cadc) [powerpc64le-linux]
-- Control frame information -----------------------------------------------
c:0031 p:0003 s:0175 e:000174 METHOD <internal:gc>:213
c:0030 p:0031 s:0171 e:000169 METHOD /builddir/build/BUILD/ruby-3.0.3/test/ruby/test_gc_compact.rb:146
c:0029 p:0052 s:0165 e:000164 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:1283
c:0028 p:0065 s:0159 e:000158 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:1330
c:0027 p:0013 s:0150 e:000149 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit/testcase.rb:18
c:0026 p:0077 s:0145 e:000144 BLOCK  /builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:979 [FINISH]
c:0025 p:---- s:0138 e:000137 CFUNC  :map
c:0024 p:0006 s:0134 e:000133 BLOCK  /builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:972
c:0023 p:0186 s:0130 E:001920 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:999
c:0022 p:0042 s:0118 e:000117 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:1136
c:0021 p:0010 s:0111 e:000109 BLOCK  /builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:627 [FINISH]
c:0020 p:---- s:0105 e:000104 CFUNC  :each
c:0019 p:0054 s:0101 E:000198 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:625
c:0018 p:0008 s:0094 E:001178 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:662
c:0017 p:0140 s:0087 E:0000b8 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:908
c:0016 p:0016 s:0074 E:001018 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:1073
c:0015 p:0005 s:0069 E:000910 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:1147
c:0014 p:0006 s:0065 E:001a48 BLOCK  /builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:1134 [FINISH]
c:0013 p:---- s:0061 e:000060 CFUNC  :each
c:0012 p:0047 s:0057 E:0015b8 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:1133
c:0011 p:0013 s:0052 E:001af8 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:1121
c:0010 p:0008 s:0047 E:001750 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:847
c:0009 p:0008 s:0041 E:0024a0 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:695
c:0008 p:0015 s:0035 E:0019f8 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:34
c:0007 p:0006 s:0030 E:0010d8 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:1175
c:0006 p:0032 s:0025 E:000580 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:1245
c:0005 p:0009 s:0021 E:000098 METHOD /builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:1249
c:0004 p:0172 s:0016 E:001be8 TOP    /builddir/build/BUILD/ruby-3.0.3/tool/test/runner.rb:23 [FINISH]
c:0003 p:---- s:0011 e:000010 CFUNC  :require_relative
c:0002 p:0092 s:0006 E:000ff0 EVAL   ./test/runner.rb:11 [FINISH]
c:0001 p:0000 s:0003 E:001730 (none) [FINISH]
-- Ruby level backtrace information ----------------------------------------
./test/runner.rb:11:in `<main>'
./test/runner.rb:11:in `require_relative'
/builddir/build/BUILD/ruby-3.0.3/tool/test/runner.rb:23:in `<top (required)>'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:1249:in `run'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:1245:in `run'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:1175:in `run'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:34:in `run'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:695:in `run'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:847:in `run'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:1121:in `run'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:1133:in `_run'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:1133:in `each'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:1134:in `block in _run'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:1147:in `run_tests'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:1073:in `_run_anything'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:908:in `_run_anything'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:662:in `_run_suites'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:625:in `_run_suites'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:625:in `each'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:627:in `block in _run_suites'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:1136:in `_run_suite'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:999:in `_run_suite'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:972:in `block in _run_suite'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:972:in `map'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:979:in `block (2 levels) in _run_suite'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit/testcase.rb:18:in `run'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/minitest/unit.rb:1330:in `run'
/builddir/build/BUILD/ruby-3.0.3/tool/lib/test/unit.rb:1283:in `run_test'
/builddir/build/BUILD/ruby-3.0.3/test/ruby/test_gc_compact.rb:146:in `test_ast_compacts'
<internal:gc>:213:in `compact'
-- C level backtrace information -------------------------------------------
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_print_backtrace+0x24) [0x7fff89f83424] vm_dump.c:758
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_vm_bugreport.constprop.0+0x5c0) [0x7fff89fa1520] vm_dump.c:998
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_bug+0xa4) [0x7fff89cdbed8] error.c:763
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(gc_sweep_step.isra.0+0x1c60) [0x7fff89fb15f0] gc.c:4505
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(gc_sweep.lto_priv.0+0x13c) [0x7fff89d8fb0c] gc.c:5153
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(gc_start.lto_priv.0+0xb10) [0x7fff89d9ba40] gc.c:7465
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(garbage_collect.lto_priv.0+0x60) [0x7fff89d9bea0] gc.c:8202
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(gc_compact.lto_priv.0+0x48) [0x7fff89d9c528] gc.c:8545
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(builtin_invoker0.lto_priv.0+0x24) [0x7fff89f65444] vm_insnhelper.c:5445
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_exec_core.lto_priv.0+0x24f8) [0x7fff89f6fc38] insns.def:1482
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_vm_exec+0x130) [0x7fff89f8c580] vm.c:2172
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_yield+0x2f8) [0x7fff89f774c8] vm.c:1263
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_ary_collect.lto_priv.0+0x74) [0x7fff89ce4944] array.c:3635
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(ractor_safe_call_cfunc_0.lto_priv.0+0x24) [0x7fff89f57724] vm_insnhelper.c:2748
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_call_cfunc_with_frame+0x150) [0x7fff89f62470] vm_insnhelper.c:2931
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_sendish.lto_priv.0+0x3dc) [0x7fff89f6792c] vm_insnhelper.c:4532
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_exec_core.lto_priv.0+0x1890) [0x7fff89f6efd0] insns.def:770
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_vm_exec+0x130) [0x7fff89f8c580] vm.c:2172
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_yield+0x2f8) [0x7fff89f774c8] vm.c:1263
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_ary_each+0x54) [0x7fff89ce45b4] array.c:2523
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(ractor_safe_call_cfunc_0.lto_priv.0+0x24) [0x7fff89f57724] vm_insnhelper.c:2748
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_call_cfunc_with_frame+0x150) [0x7fff89f62470] vm_insnhelper.c:2931
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_call_method_each_type+0x6e8) [0x7fff89f632a8] vm_insnhelper.c:3400
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_sendish.lto_priv.0+0x3dc) [0x7fff89f6792c] vm_insnhelper.c:4532
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_exec_core.lto_priv.0+0x1890) [0x7fff89f6efd0] insns.def:770
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_vm_exec+0x130) [0x7fff89f8c580] vm.c:2172
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_yield+0x2f8) [0x7fff89f774c8] vm.c:1263
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_ary_each+0x54) [0x7fff89ce45b4] array.c:2523
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(ractor_safe_call_cfunc_0.lto_priv.0+0x24) [0x7fff89f57724] vm_insnhelper.c:2748
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_call_cfunc_with_frame+0x150) [0x7fff89f62470] vm_insnhelper.c:2931
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_call_method_each_type+0x6e8) [0x7fff89f632a8] vm_insnhelper.c:3400
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_sendish.lto_priv.0+0x3dc) [0x7fff89f6792c] vm_insnhelper.c:4532
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_exec_core.lto_priv.0+0x1890) [0x7fff89f6efd0] insns.def:770
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_vm_exec+0x130) [0x7fff89f8c580] vm.c:2172
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_iseq_eval+0x158) [0x7fff89f8da88] vm.c:2409
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(load_iseq_eval+0x1a0) [0x7fff89dda870] load.c:638
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(require_internal.lto_priv.0+0xa2c) [0x7fff89ddc93c] load.c:1109
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_require_string+0x44) [0x7fff89ddcb44] load.c:1186
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_f_require_relative+0x78) [0x7fff89ddcc88] load.c:901
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(ractor_safe_call_cfunc_1.lto_priv.0+0x28) [0x7fff89f57778] vm_insnhelper.c:2755
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_call_cfunc_with_frame+0x150) [0x7fff89f62470] vm_insnhelper.c:2931
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_sendish.lto_priv.0+0x3dc) [0x7fff89f6792c] vm_insnhelper.c:4532
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(vm_exec_core.lto_priv.0+0x16c) [0x7fff89f6d8ac] insns.def:789
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_vm_exec+0x130) [0x7fff89f8c580] vm.c:2172
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_iseq_eval_main+0xf0) [0x7fff89f8dbd0] vm.c:2420
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(rb_ec_exec_node+0xb0) [0x7fff89d73c30] eval.c:317
/builddir/build/BUILD/ruby-3.0.3/libruby.so.3.0.3(ruby_run_node+0x7c) [0x7fff89d73d8c] eval.c:375
/builddir/build/BUILD/ruby-3.0.3/ruby(main+0x78) [0x10ca50228] ./main.c:50

However, it is probably fixed in master, because I have not hit this issue while testing Ruby 3.1.0.

For the time being, the build is available here and this is the build.log

Updated by vo.x (Vit Ondruch) 12 months ago

vo.x (Vit Ondruch) wrote in #note-5:

Thanks for looking into this. However, applying these two patches fixes i686 but breaks ppc64le :(

And sometimes aarch64

Updated by peterzhu2118 (Peter Zhu) 12 months ago

Hey @vo.x (Vit Ondruch), can you check if also backporting this PR fixes the crashes? https://github.com/ruby/ruby/pull/4227

Ruby 3.0 is still using posix_memalign to allocate pages. Only memory allocated with mmap is allowed to be passed into mprotect.
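
To illustrate the constraint, here is a minimal sketch of toggling protection on a region obtained from mmap. POSIX leaves the behavior of mprotect unspecified for memory that was not mapped with mmap, and mprotect also requires the address to be a multiple of the system page size. The helper names `map_region`, `lock_region` and `unlock_region` are hypothetical and are not taken from gc.c.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map an anonymous, page-aligned, read/write region of `len` bytes.
 * Returns NULL on failure. (Hypothetical helper.) */
static void *map_region(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Make the region inaccessible; returns 0 on success, -1 on failure. */
static int lock_region(void *p, size_t len)
{
    return mprotect(p, len, PROT_NONE);
}

/* Make the region readable and writable again; returns 0 on success. */
static int unlock_region(void *p, size_t len)
{
    return mprotect(p, len, PROT_READ | PROT_WRITE);
}
```

A caller would take the region length from `sysconf(_SC_PAGESIZE)` (or a multiple of it), since that value differs between platforms.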

Updated by vo.x (Vit Ondruch) 12 months ago

Unfortunately, the build fails already in miniruby:

... snip ...

gcc -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -mcpu=power8 -mtune=power8 -fasynchronous-unwind-tables -fstack-clash-protection -fPIC -m64 -L. -Wl,-z,relro -Wl,--as-needed  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -fstack-protector-strong -rdynamic -Wl,-export-dynamic -fstack-protector-strong  main.o dmydln.o miniinit.o dmyext.o abrt.o array.o ast.o bignum.o class.o compar.o compile.o complex.o cont.o debug.o debug_counter.o dir.o dln_find.o encoding.o enum.o enumerator.o error.o eval.o file.o gc.o hash.o inits.o io.o iseq.o load.o marshal.o math.o memory_view.o mjit.o mjit_compile.o node.o numeric.o object.o pack.o parse.o proc.o process.o ractor.o random.o range.o rational.o re.o regcomp.o regenc.o regerror.o regexec.o regparse.o regsyntax.o ruby.o scheduler.o signal.o sprintf.o st.o strftime.o string.o struct.o symbol.o thread.o time.o transcode.o transient_heap.o util.o variable.o version.o vm.o vm_backtrace.o vm_dump.o vm_sync.o vm_trace.o coroutine/ppc64le/Context.o probes.o enc/ascii.o enc/us_ascii.o enc/unicode.o enc/utf_8.o enc/trans/newline.o setproctitle.o strlcat.o strlcpy.o addr2line.o  -lz -lpthread -lrt -lrt -lgmp -ldl -lcrypt -lm  -lm   -o miniruby
:
gcc -E -DRUBY_EXPORT -I. -I.ext/include/powerpc64le-linux -I./include -I. -I./enc/unicode/12.1.0    "./version.c" | \
./miniruby -I./lib -I. -I.ext/common  "./tool/generic_erb.rb" -o powerpc64le-linux-fake.rb "./template/fake.rb.in" \
	i=- srcdir="." BASERUBY="echo executable host ruby is required.  use --with-baseruby option.; false"
make: *** [uncommon.mk:755: powerpc64le-linux-fake.rb] Error 139
./miniruby -I./lib -I. -I.ext/common  ./tool/generic_erb.rb -c -o encdb.h ./template/encdb.h.tmpl ./enc enc
make: *** Waiting for unfinished jobs....
make: *** [uncommon.mk:1096: encdb.h] Segmentation fault (core dumped)
./miniruby -I./lib -I. -I.ext/common  -n \
-e 'BEGIN{version=ARGV.shift;mis=ARGV.dup}' \
-e 'END{abort "UNICODE version mismatch: #{mis}" unless mis.empty?}' \
-e '(mis.delete(ARGF.path); ARGF.close) if /ONIG_UNICODE_VERSION_STRING +"#{Regexp.quote(version)}"/o' \
12.1.0 ./enc/unicode/12.1.0/casefold.h ./enc/unicode/12.1.0/name2ctype.h 
make: *** [uncommon.mk:820: .rbconfig.time] Segmentation fault (core dumped)
./miniruby -I./lib -I. -I.ext/common  ./tool/generic_erb.rb -o builtin_binary.inc \
	./template/builtin_binary.inc.tmpl -- --cross=no
make: *** [uncommon.mk:1145: builtin_binary.inc] Segmentation fault (core dumped)

This is the build and full build.log

Updated by peterzhu2118 (Peter Zhu) 12 months ago

I don't have access to a ppc64 machine. Do you know what the crash is?

Updated by vo.x (Vit Ondruch) 12 months ago

$ gdb --args ./miniruby -I./lib -I. -I.ext/common  ./tool/generic_erb.rb -o builtin_binary.inc      ./template/builtin_binary.inc.tmpl -- --cross=no
GNU gdb (GDB) Fedora 11.1-6.fc36
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "ppc64le-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./miniruby...
warning: File "/builddir/build/BUILD/ruby-3.0.3/.gdbinit" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
	add-auto-load-safe-path /builddir/build/BUILD/ruby-3.0.3/.gdbinit
line to your configuration file "/builddir/.config/gdb/gdbinit".
To completely disable this security protection add
	set auto-load safe-path /
line to your configuration file "/builddir/.config/gdb/gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
	info "(gdb)Auto-loading safe path"
(gdb) r
Starting program: /builddir/build/BUILD/ruby-3.0.3/miniruby -I./lib -I. -I.ext/common ./tool/generic_erb.rb -o builtin_binary.inc ./template/builtin_binary.inc.tmpl -- --cross=no
Download failed: No route to host.  Continuing without debug info for /lib64/libz.so.1.
Download failed: No route to host.  Continuing without debug info for /lib64/libgmp.so.10.
Download failed: No route to host.  Continuing without debug info for /lib64/libcrypt.so.2.
Download failed: No route to host.  Continuing without debug info for /lib64/libm.so.6.
Download failed: No route to host.  Continuing without debug info for /lib64/libc.so.6.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
heap_page_allocate (objspace=0x1004b1400) at gc.c:1870
1870	    page_body->header.page = page;
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34.9000-26.fc36.ppc64le gmp-6.2.1-1.fc36.ppc64le libxcrypt-4.4.26-4.fc36.ppc64le zlib-1.2.11-30.fc35.ppc64le
(gdb) bt
#0  heap_page_allocate (objspace=0x1004b1400) at gc.c:1870
#1  heap_page_create (objspace=0x1004b1400) at gc.c:1910
#2  heap_assign_page (objspace=0x1004b1400, heap=0x1004b1428) at gc.c:1935
#3  0x00000001000df220 in heap_add_pages (add=24, heap=0x1004b1428, objspace=0x1004b1400) at gc.c:1948
#4  Init_heap () at gc.c:3173
#5  ruby_setup () at eval.c:87
#6  0x00000001000e50e8 in ruby_init () at eval.c:110
#7  0x0000000100032fa0 in main (argc=<optimized out>, argv=<optimized out>) at ./main.c:49
(gdb) l heap_page_allocate
1801	    struct heap_page_body *page_body = 0;
1802	    size_t hi, lo, mid;
1803	    int limit = HEAP_PAGE_OBJ_LIMIT;
1804	
1805	    /* assign heap_page body (contains heap_page_header and RVALUEs) */
1806	    page_body = (struct heap_page_body *)rb_aligned_malloc(HEAP_PAGE_ALIGN, HEAP_PAGE_SIZE);
1807	    if (page_body == 0) {
1808		rb_memerror();
1809	    }
1810	
(gdb) 
1811	    /* assign heap_page entry */
1812	    page = calloc1(sizeof(struct heap_page));
1813	    if (page == 0) {
1814	        rb_aligned_free(page_body, HEAP_PAGE_SIZE);
1815		rb_memerror();
1816	    }
1817	
1818	    /* adjust obj_limit (object number available in this page) */
1819	    start = (RVALUE*)((VALUE)page_body + sizeof(struct heap_page_header));
1820	    if ((VALUE)start % sizeof(RVALUE) != 0) {
(gdb) 
1821		int delta = (int)(sizeof(RVALUE) - ((VALUE)start % sizeof(RVALUE)));
1822		start = (RVALUE*)((VALUE)start + delta);
1823		limit = (HEAP_PAGE_SIZE - (int)((VALUE)start - (VALUE)page_body))/(int)sizeof(RVALUE);
1824	    }
1825	    end = start + limit;
1826	
1827	    /* setup heap_pages_sorted */
1828	    lo = 0;
1829	    hi = heap_allocated_pages;
1830	    while (lo < hi) {
(gdb) 
1831		struct heap_page *mid_page;
1832	
1833		mid = (lo + hi) / 2;
1834		mid_page = heap_pages_sorted[mid];
1835		if (mid_page->start < start) {
1836		    lo = mid + 1;
1837		}
1838		else if (mid_page->start > start) {
1839		    hi = mid;
1840		}
(gdb) 
1841		else {
1842		    rb_bug("same heap page is allocated: %p at %"PRIuVALUE, (void *)page_body, (VALUE)mid);
1843		}
1844	    }
1845	
1846	    if (hi < heap_allocated_pages) {
1847		MEMMOVE(&heap_pages_sorted[hi+1], &heap_pages_sorted[hi], struct heap_page_header*, heap_allocated_pages - hi);
1848	    }
1849	
1850	    heap_pages_sorted[hi] = page;
(gdb) 
1851	
1852	    heap_allocated_pages++;
1853	
1854	    GC_ASSERT(heap_eden->total_pages + heap_allocatable_pages <= heap_pages_sorted_length);
1855	    GC_ASSERT(heap_eden->total_pages + heap_tomb->total_pages == heap_allocated_pages - 1);
1856	    GC_ASSERT(heap_allocated_pages <= heap_pages_sorted_length);
1857	
1858	    objspace->profile.total_allocated_pages++;
1859	
1860	    if (heap_allocated_pages > heap_pages_sorted_length) {
(gdb) 
1861		rb_bug("heap_page_allocate: allocated(%"PRIdSIZE") > sorted(%"PRIdSIZE")",
1862		       heap_allocated_pages, heap_pages_sorted_length);
1863	    }
1864	
1865	    if (heap_pages_lomem == 0 || heap_pages_lomem > start) heap_pages_lomem = start;
1866	    if (heap_pages_himem < end) heap_pages_himem = end;
1867	
1868	    page->start = start;
1869	    page->total_slots = limit;
1870	    page_body->header.page = page;
(gdb) 
1871	
1872	    for (p = start; p != end; p++) {
1873		gc_report(3, objspace, "assign_heap_page: %p is added to freelist\n", (void *)p);
1874		heap_page_add_freeobj(objspace, page, (VALUE)p);
1875	    }
1876	    page->free_slots = limit;
1877	
1878	    asan_poison_memory_region(&page->freelist, sizeof(RVALUE*));
1879	    return page;
1880	}
(gdb) 

Updated by vo.x (Vit Ondruch) 12 months ago

It seems that rb_aligned_malloc already returns an inaccessible pointer:

Breakpoint 1, heap_page_allocate (objspace=0x1004b1400) at gc.c:1806
1806	    page_body = (struct heap_page_body *)rb_aligned_malloc(HEAP_PAGE_ALIGN, HEAP_PAGE_SIZE);
(gdb) p page_body
$3 = (struct heap_page_body *) 0x0
(gdb) n
1807	    if (page_body == 0) {
(gdb) p page_body
$4 = (struct heap_page_body *) 0x7ffff7844000
(gdb) p *page_body
Cannot access memory at address 0x7ffff7844000

Updated by vo.x (Vit Ondruch) 12 months ago

Here I am stepping through rb_aligned_malloc:

(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /builddir/build/BUILD/ruby-3.0.3/miniruby 
Download failed: No route to host.  Continuing without debug info for /lib64/libz.so.1.
Download failed: No route to host.  Continuing without debug info for /lib64/libgmp.so.10.
Download failed: No route to host.  Continuing without debug info for /lib64/libcrypt.so.2.
Download failed: No route to host.  Continuing without debug info for /lib64/libm.so.6.
Download failed: No route to host.  Continuing without debug info for /lib64/libc.so.6.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, heap_page_allocate (objspace=0x1004b1400) at gc.c:1806
1806	    page_body = (struct heap_page_body *)rb_aligned_malloc(HEAP_PAGE_ALIGN, HEAP_PAGE_SIZE);
(gdb) c
Continuing.

Breakpoint 2, 0x0000000100109b48 in rb_aligned_malloc (alignment=16384, size=16384) at gc.c:10354
10354	{
(gdb) l
10349	    EC_JUMP_TAG(ec, TAG_RAISE);
10350	}
10351	
10352	void *
10353	rb_aligned_malloc(size_t alignment, size_t size)
10354	{
10355	    void *res;
10356	
10357	#if defined __MINGW32__
10358	    res = __mingw_aligned_malloc(size, alignment);
(gdb) 
10359	#elif defined _WIN32
10360	    void *_aligned_malloc(size_t, size_t);
10361	    res = _aligned_malloc(size, alignment);
10362	#elif defined(HAVE_MMAP)
10363	    GC_ASSERT(alignment % sysconf(_SC_PAGE_SIZE) == 0);
10364	
10365	    char *ptr = mmap(NULL, alignment + size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
10366	    if (ptr == MAP_FAILED) {
10367	        return NULL;
10368	    }
(gdb) 
10369	
10370	    char *aligned = ptr + alignment;
10371	    aligned -= ((VALUE)aligned & (alignment - 1));
10372	    GC_ASSERT(aligned > ptr);
10373	    GC_ASSERT(aligned <= ptr + alignment);
10374	
10375	    size_t start_out_of_range_size = aligned - ptr;
10376	    GC_ASSERT(start_out_of_range_size % sysconf(_SC_PAGE_SIZE) == 0);
10377	    if (start_out_of_range_size > 0) {
10378	        if (munmap(ptr, start_out_of_range_size)) {
(gdb) 
10379	            rb_bug("rb_aligned_malloc: munmap faile for start");
10380	        }
10381	    }
10382	
10383	    size_t end_out_of_range_size = alignment - start_out_of_range_size;
10384	    GC_ASSERT(end_out_of_range_size % sysconf(_SC_PAGE_SIZE) == 0);
10385	    if (end_out_of_range_size > 0) {
10386	        if (munmap(aligned + size, end_out_of_range_size)) {
10387	            rb_bug("rb_aligned_malloc: munmap failed for end");
10388	        }
(gdb) 
10389	    }
10390	
10391	    res = (void *)aligned;
10392	#else
10393	    char* aligned;
10394	    res = malloc(alignment + size + sizeof(void*));
10395	    aligned = (char*)res + alignment + sizeof(void*);
10396	    aligned -= ((VALUE)aligned & (alignment - 1));
10397	    ((void**)aligned)[-1] = res;
10398	    res = (void*)aligned;
(gdb) 
10399	#endif
10400	
10401	    /* alignment must be a power of 2 */
10402	    GC_ASSERT(((alignment - 1) & alignment) == 0);
10403	    GC_ASSERT(alignment % sizeof(void*) == 0);
10404	    return res;
10405	}
10406	
10407	static void
10408	rb_aligned_free(void *ptr, size_t size)
(gdb) n
10365	    char *ptr = mmap(NULL, alignment + size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
(gdb) 
10366	    if (ptr == MAP_FAILED) {
(gdb) p ptr
$18 = 0x7ffff7840000 ""
(gdb) n
10370	    char *aligned = ptr + alignment;
(gdb) p alignment
$19 = 16384
(gdb) n
10371	    aligned -= ((VALUE)aligned & (alignment - 1));
(gdb) p aligned
$20 = 0x7ffff7844000 ""
(gdb) n
10377	    if (start_out_of_range_size > 0) {
(gdb) p aligned
$21 = 0x7ffff7844000 ""
(gdb) p start_out_of_range_size 
$22 = 16384
(gdb) n
10378	        if (munmap(ptr, start_out_of_range_size)) {
(gdb) p ptr
$23 = 0x7ffff7840000 ""
(gdb) 
$24 = 0x7ffff7840000 ""
(gdb) n
10385	    if (end_out_of_range_size > 0) {
(gdb) p ptr
$25 = <optimized out>
(gdb) p aligned
$26 = 0x7ffff7844000 <error: Cannot access memory at address 0x7ffff7844000>
(gdb) p end_out_of_range_size 
$27 = 0

The originally mapped block is subsequently unmapped, so page_body is not accessible. I have also applied commit 0bd1bc559f7a904e7fb64d41b98a9c27ddec7298, but without much success.
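The trace above is consistent with munmap() operating on whole system pages: on Linux, every page that contains any part of the requested range is unmapped. The following is a minimal sketch of that arithmetic, not code from gc.c. The helper names are hypothetical, and the 64 KiB system page size is an assumption at this point in the thread (it is only stated in a later note).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round len up to a whole number of system pages, which is the
 * granularity munmap() effectively works at. */
static uint64_t round_up_to_page(uint64_t len, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (len + page_size - 1) & ~(page_size - 1);
}

/* Hypothetical helper: repeats the head-trimming arithmetic seen in
 * rb_aligned_malloc and reports whether unmapping the head of the
 * mapping would also take away the start of the aligned block.
 * ptr is assumed to be page-aligned, as mmap() returns it. */
static int head_trim_unmaps_aligned_block(uint64_t ptr, uint64_t alignment,
                                          uint64_t page_size)
{
    uint64_t aligned = ptr + alignment;
    aligned -= aligned & (alignment - 1);

    uint64_t head = aligned - ptr;  /* start_out_of_range_size in gc.c */
    uint64_t unmapped_end = ptr + round_up_to_page(head, page_size);

    return head > 0 && unmapped_end > aligned;
}
```

With the values from the session (ptr = 0x7ffff7840000, alignment = 16384), the helper reports a clean trim for 4 KiB system pages, but for 64 KiB system pages the 16 KiB head shares a page with the aligned block, so unmapping the head also removes the block.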

Updated by peterzhu2118 (Peter Zhu) 12 months ago

Hmmm, that's really odd. I think I can get access to the ppc64 machine on rubyci.org. I'll try to debug this next week.

Updated by vo.x (Vit Ondruch) 12 months ago

peterzhu2118 (Peter Zhu) wrote in #note-13:

Hmmm, that's really odd. I think I can get access to the ppc64 machine on rubyci.org. I'll try to debug this next week.

If you like, you could ping @sharkcz (Dan Horák) on #fedora-ppc at libera.chat IRC for PPC shell access:

https://fedoraproject.org/wiki/Architectures/PowerPC

Updated by vo.x (Vit Ondruch) 12 months ago

vo.x (Vit Ondruch) wrote in #note-5:

Thanks for looking into this. However, applying these two patches fixes i686 but breaks ppc64le :(

[ 8890/21266] TestGCCompact#test_ast_compacts<internal:gc>:213: [BUG] Couldn't unprotect page 0x0000000140f98000
ruby 3.0.3p157 (2021-11-24 revision 3fb7d2cadc) [powerpc64le-linux]
-- Control frame information -----------------------------------------------

Since I have the PPC at hand, here is the full backtrace:

$ make gdb-ruby TESTRUN_SCRIPT=test/ruby/test_gc_compact.rb RUNOPT0='-I.ext/powerpc64le-linux:tool/lib'
./revision.h unchanged
Reading symbols from /builddir/build/BUILD/ruby-3.0.3/ruby...
warning: File "/builddir/build/BUILD/ruby-3.0.3/.gdbinit" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
	add-auto-load-safe-path /builddir/build/BUILD/ruby-3.0.3/.gdbinit
line to your configuration file "/builddir/.config/gdb/gdbinit".
To completely disable this security protection add
	set auto-load safe-path /
line to your configuration file "/builddir/.config/gdb/gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
	info "(gdb)Auto-loading safe path"
Function "rb_assert_failure" not defined.
Breakpoint 1 (rb_assert_failure) pending.
Function "rb_bug" not defined.
Breakpoint 2 (rb_bug) pending.
Function "ruby_debug_breakpoint" not defined.
Breakpoint 3 (ruby_debug_breakpoint) pending.
warning: ./breakpoints.gdb: No such file or directory
Download failed: No route to host.  Continuing without debug info for /lib64/libc.so.6.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Download failed: No route to host.  Continuing without debug info for /lib64/libz.so.1.
Download failed: No route to host.  Continuing without debug info for /lib64/libgmp.so.10.
Download failed: No route to host.  Continuing without debug info for /lib64/libcrypt.so.2.
Download failed: No route to host.  Continuing without debug info for /lib64/libm.so.6.
Download failed: No route to host.  Continuing without debug info for /lib64/libffi.so.6.
Run options: 
  --seed=62

# Running tests:

[Detaching after vfork from child process 635]
[1/8] TestGCCompact#test_ast_compacts
Breakpoint 2, 0x00007ffff7b0be3c in rb_bug (fmt=0x7ffff7df85f8 "Couldn't unprotect page %p") at error.c:768
768	{
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34.9000-26.fc36.ppc64le gmp-6.2.1-1.fc36.ppc64le libffi-3.1-28.fc34.ppc64le libxcrypt-4.4.26-4.fc36.ppc64le zlib-1.2.11-30.fc35.ppc64le
(gdb) bt
#0  0x00007ffff7b0be3c in rb_bug (fmt=0x7ffff7df85f8 "Couldn't unprotect page %p") at error.c:768
#1  0x00007ffff7de15f0 in unlock_page_body (objspace=<optimized out>, body=0x1004f4000) at gc.c:4505
#2  gc_fill_swept_page (empty_slots=<synthetic pointer>, freed_slots=<synthetic pointer>, sweep_page=0x100055a90, heap=<optimized out>, objspace=<optimized out>) at gc.c:4780
#3  gc_page_sweep (sweep_page=0x100055a90, heap=0x100051528, objspace=<optimized out>) at gc.c:4955
#4  gc_sweep_step.isra.0 (objspace=0x100051500, heap=0x100051528) at gc.c:5100
#5  0x00007ffff7bbfb0c in gc_sweep_rest (objspace=<optimized out>) at gc.c:5153
#6  gc_sweep (objspace=0x100051500) at gc.c:5270
#7  0x00007ffff7bcba40 in gc_marks (full_mark=<optimized out>, objspace=0x100051500) at gc.c:7465
#8  gc_start (objspace=objspace@entry=0x100051500, reason=<optimized out>) at gc.c:8314
#9  0x00007ffff7bcbea0 in garbage_collect (objspace=objspace@entry=0x100051500, reason=reason@entry=238592) at gc.c:8202
#10 0x00007ffff7bcc528 in gc_start_internal (compact=20, immediate_sweep=20, immediate_mark=20, full_mark=20, self=4295821160, ec=0x100051bc0) at gc.c:8545
#11 gc_compact (ec=0x100051bc0, self=4295821160) at gc.c:9456
#12 0x00007ffff7d95444 in builtin_invoker0 (ec=<optimized out>, self=<optimized out>, argv=<optimized out>, funcptr=<optimized out>) at vm_insnhelper.c:5445
#13 0x00007ffff7d9fc8c in vm_exec_core (ec=0x100051bc0, initial=<optimized out>) at insns.def:1493
#14 0x00007ffff7dbc580 in rb_vm_exec (ec=0x100051bc0, mjit_enable_p=<optimized out>) at vm.c:2172
#15 0x00007ffff7da74c8 in invoke_block (captured=<optimized out>, captured=<optimized out>, opt_pc=<optimized out>, type=572653569, cref=0x0, self=4298277080, iseq=0x100418ba0, ec=0x100051bc0)
    at vm_insnhelper.c:400
#16 invoke_iseq_block_from_c (me=0x0, is_lambda=<optimized out>, cref=0x0, passed_block_handler=0, kw_splat=0, argv=0x7fffffffc348, argc=1, self=4298277080, captured=<optimized out>, ec=0x100051bc0)
    at vm.c:1335
#17 invoke_block_from_c_bh (force_blockarg=<optimized out>, is_lambda=<optimized out>, cref=<optimized out>, passed_block_handler=<optimized out>, kw_splat=<optimized out>, argv=<optimized out>, 
    argc=<optimized out>, block_handler=<optimized out>, ec=<optimized out>) at vm.c:1353
#18 vm_yield (kw_splat=0, argv=0x7fffffffc348, argc=1, ec=0x100051bc0) at vm.c:1398
#19 rb_yield_0 (argv=0x7fffffffc348, argc=1) at vm_eval.c:1333
#20 rb_yield (val=<optimized out>) at vm_eval.c:1349
#21 0x00007ffff7b14944 in rb_ary_collect (ary=4300161960) at array.c:3635
#22 0x00007ffff7d87724 in ractor_safe_call_cfunc_0 (recv=<optimized out>, argc=<optimized out>, argv=<optimized out>, func=<optimized out>) at vm_insnhelper.c:2748
#23 0x00007ffff7d92470 in vm_call_cfunc_with_frame (ec=0x100051bc0, reg_cfp=0x7ffff74cfb40, calling=<optimized out>) at vm_insnhelper.c:2931
#24 0x00007ffff7d9792c in vm_sendish (ec=0x100051bc0, reg_cfp=0x7ffff74cfb40, cd=0x10042cc30, block_handler=<optimized out>, method_explorer=<optimized out>) at vm_callinfo.h:336
#25 0x00007ffff7d9efd0 in vm_exec_core (ec=0x100051bc0, initial=<optimized out>) at insns.def:770
#26 0x00007ffff7dbc580 in rb_vm_exec (ec=0x100051bc0, mjit_enable_p=<optimized out>) at vm.c:2172
#27 0x00007ffff7da74c8 in invoke_block (captured=<optimized out>, captured=<optimized out>, opt_pc=<optimized out>, type=572653569, cref=0x0, self=4298277080, iseq=0x10037e618, ec=0x100051bc0)
    at vm_insnhelper.c:400
#28 invoke_iseq_block_from_c (me=0x0, is_lambda=<optimized out>, cref=0x0, passed_block_handler=0, kw_splat=0, argv=0x7fffffffcc38, argc=1, self=4298277080, captured=<optimized out>, ec=0x100051bc0)
    at vm.c:1335
#29 invoke_block_from_c_bh (force_blockarg=<optimized out>, is_lambda=<optimized out>, cref=<optimized out>, passed_block_handler=<optimized out>, kw_splat=<optimized out>, argv=<optimized out>, 
    argc=<optimized out>, block_handler=<optimized out>, ec=<optimized out>) at vm.c:1353
#30 vm_yield (kw_splat=0, argv=0x7fffffffcc38, argc=1, ec=0x100051bc0) at vm.c:1398
#31 rb_yield_0 (argv=0x7fffffffcc38, argc=1) at vm_eval.c:1333
#32 rb_yield (val=<optimized out>) at vm_eval.c:1349
#33 0x00007ffff7b145b4 in rb_ary_each (ary=<optimized out>) at array.c:2523
#34 rb_ary_each (ary=4300163800) at array.c:2517
#35 0x00007ffff7d87724 in ractor_safe_call_cfunc_0 (recv=<optimized out>, argc=<optimized out>, argv=<optimized out>, func=<optimized out>) at vm_insnhelper.c:2748
#36 0x00007ffff7d92470 in vm_call_cfunc_with_frame (ec=0x100051bc0, reg_cfp=0x7ffff74cfc58, calling=<optimized out>) at vm_insnhelper.c:2931
#37 0x00007ffff7d9792c in vm_sendish (ec=0x100051bc0, reg_cfp=0x7ffff74cfc58, cd=0x100278a90, block_handler=<optimized out>, method_explorer=<optimized out>) at vm_callinfo.h:336
#38 0x00007ffff7d9efd0 in vm_exec_core (ec=0x100051bc0, initial=<optimized out>) at insns.def:770
#39 0x00007ffff7dbc580 in rb_vm_exec (ec=0x100051bc0, mjit_enable_p=<optimized out>) at vm.c:2172
#40 0x00007ffff7da74c8 in invoke_block (captured=<optimized out>, captured=<optimized out>, opt_pc=<optimized out>, type=572653569, cref=0x0, self=4298277080, iseq=0x10043bb00, ec=0x100051bc0)
    at vm_insnhelper.c:400
#41 invoke_iseq_block_from_c (me=0x0, is_lambda=<optimized out>, cref=0x0, passed_block_handler=0, kw_splat=0, argv=0x7fffffffd528, argc=1, self=4298277080, captured=<optimized out>, ec=0x100051bc0)
    at vm.c:1335
#42 invoke_block_from_c_bh (force_blockarg=<optimized out>, is_lambda=<optimized out>, cref=<optimized out>, passed_block_handler=<optimized out>, kw_splat=<optimized out>, argv=<optimized out>, 
    argc=<optimized out>, block_handler=<optimized out>, ec=<optimized out>) at vm.c:1353
#43 vm_yield (kw_splat=0, argv=0x7fffffffd528, argc=1, ec=0x100051bc0) at vm.c:1398
#44 rb_yield_0 (argv=0x7fffffffd528, argc=1) at vm_eval.c:1333
#45 rb_yield (val=<optimized out>) at vm_eval.c:1349
#46 0x00007ffff7b145b4 in rb_ary_each (ary=<optimized out>) at array.c:2523
#47 rb_ary_each (ary=4300164280) at array.c:2517
#48 0x00007ffff7d87724 in ractor_safe_call_cfunc_0 (recv=<optimized out>, argc=<optimized out>, argv=<optimized out>, func=<optimized out>) at vm_insnhelper.c:2748
#49 0x00007ffff7d92470 in vm_call_cfunc_with_frame (ec=0x100051bc0, reg_cfp=0x7ffff74cfde0, calling=<optimized out>) at vm_insnhelper.c:2931
#50 0x00007ffff7d9792c in vm_sendish (ec=0x100051bc0, reg_cfp=0x7ffff74cfde0, cd=0x100435660, block_handler=<optimized out>, method_explorer=<optimized out>) at vm_callinfo.h:336
#51 0x00007ffff7d9efd0 in vm_exec_core (ec=0x100051bc0, initial=<optimized out>) at insns.def:770
#52 0x00007ffff7dbc580 in rb_vm_exec (ec=0x100051bc0, mjit_enable_p=<optimized out>) at vm.c:2172
#53 0x00007ffff7dafab0 in invoke_block (captured=<optimized out>, captured=<optimized out>, opt_pc=<optimized out>, type=<optimized out>, cref=0x0, self=4298951640, iseq=0x10031b888, ec=0x100051bc0)
    at vm_insnhelper.c:400
#54 invoke_iseq_block_from_c (me=0x0, is_lambda=0, cref=0x0, passed_block_handler=0, kw_splat=0, argv=0x100329fd8, argc=0, self=4298951640, captured=<optimized out>, ec=0x100051bc0) at vm.c:1335
#55 invoke_block_from_c_proc (me=0x0, is_lambda=<optimized out>, passed_block_handler=0, kw_splat=0, argv=0x100329fd8, argc=0, self=4298951640, proc=<optimized out>, ec=0x100051bc0) at vm.c:1435
#56 vm_invoke_proc (ec=0x100051bc0, proc=<optimized out>, self=4298951640, argc=<optimized out>, argv=0x100329fd8, kw_splat=<optimized out>, passed_block_handler=0) at vm.c:1464
#57 0x00007ffff7dbd8f0 in rb_vm_invoke_proc (ec=<optimized out>, proc=<optimized out>, argc=<optimized out>, argv=<optimized out>, kw_splat=<optimized out>, passed_block_handler=<optimized out>) at vm.c:1485
#58 0x00007ffff7c9c2dc in rb_proc_call (self=<optimized out>, args=<optimized out>) at proc.c:986
#59 0x00007ffff7ba2650 in rb_call_end_proc (data=4298928280) at eval_jump.c:13
#60 0x00007ffff7b9bb24 in exec_end_procs_chain (procs=procs@entry=0x7ffff7f2a778 <end_procs.lto_priv>, errp=errp@entry=0x100051c38) at eval_jump.c:105
#61 0x00007ffff7b9bc18 in rb_ec_exec_end_proc (ec=ec@entry=0x100051bc0) at eval_jump.c:120
#62 0x00007ffff7b9bef8 in rb_ec_teardown (ec=ec@entry=0x100051bc0) at eval.c:175
#63 0x00007ffff7ba3458 in rb_ec_cleanup (ec=ec@entry=0x100051bc0, ex=<optimized out>) at eval.c:243
#64 0x00007ffff7ba3d98 in ruby_run_node (n=0x1002f0bb0) at eval.c:375
#65 0x0000000100010228 in main (argc=<optimized out>, argv=<optimized out>) at ./main.c:50


(gdb) l
4500	
4501	    if (!VirtualProtect(body, HEAP_PAGE_SIZE, PAGE_READWRITE, &old_protect)) {
4502	#else
4503	    if(mprotect(body, HEAP_PAGE_SIZE, PROT_READ | PROT_WRITE)) {
4504	#endif
4505	        rb_bug("Couldn't unprotect page %p", (void *)body);
4506	    } else {
4507	        gc_report(5, objspace, "Unprotecting page in move %p\n", (void *)body);
4508	    }
4509	}
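One possible explanation for the failing mprotect() call, offered here as an assumption rather than a confirmed diagnosis: on Linux, mprotect() returns EINVAL when the address is not a multiple of the system page size, and the body address in frame #1 (0x1004f4000) is 16 KiB-aligned but not 64 KiB-aligned. The sketch below reproduces that failure mode on any Linux system, whatever its page size. The function name is hypothetical.

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS under strict -std= modes on glibc */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: maps two system pages, calls mprotect() at the
 * given byte offset into the mapping, and returns 0 on success or the
 * errno reported by mprotect(). Returns -1 if mmap() itself fails. */
static int mprotect_errno_at_offset(size_t offset)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    size_t len = 2 * (size_t)page;
    char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return -1;

    int err = 0;
    if (mprotect(base + offset, (size_t)page / 4, PROT_READ | PROT_WRITE) != 0) {
        err = errno;
    }

    munmap(base, len);
    return err;
}
```

Calling the helper with offset 0 succeeds, while an offset of a quarter page (the analogue of a 16 KiB heap page inside a 64 KiB system page) fails with EINVAL.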

Updated by peterzhu2118 (Peter Zhu) 12 months ago

I debugged this today. Can you try with commits 0130e17a410d60a10e7041ce98748b8de6946971 and 32b7dcfb56a417c1d1c354102351fc1825d653bf cherry-picked and then apply the attached patch? I was able to get it working on ppc64.

Updated by vo.x (Vit Ondruch) 12 months ago

peterzhu2118 (Peter Zhu) wrote in #note-16:

I debugged this today. Can you try with commits 0130e17a410d60a10e7041ce98748b8de6946971 and 32b7dcfb56a417c1d1c354102351fc1825d653bf cherry-picked and then apply the attached patch? I was able to get it working on ppc64.

This is the result:

  1) Error:
TestGCCompact#test_ast_compacts:
NotImplementedError: Compaction isn't available on this platform
    <internal:gc>:213:in `compact'
    /builddir/build/BUILD/ruby-3.0.3/test/ruby/test_gc_compact.rb:146:in `test_ast_compacts'
  2) Error:
TestGCCompact#test_compact_count:
NotImplementedError: Compaction isn't available on this platform
    <internal:gc>:213:in `compact'
    /builddir/build/BUILD/ruby-3.0.3/test/ruby/test_gc_compact.rb:152:in `test_compact_count'
  3) Error:
TestGCCompact#test_complex_hash_keys:
NotImplementedError: Compaction isn't available on this platform
    <internal:gc>:231:in `verify_compaction_references'
    /builddir/build/BUILD/ruby-3.0.3/test/ruby/test_gc_compact.rb:130:in `test_complex_hash_keys'
  4) Error:
TestGCCompact#test_gc_compact_stats:
NotImplementedError: Compaction isn't available on this platform
    <internal:gc>:213:in `compact'
    /builddir/build/BUILD/ruby-3.0.3/test/ruby/test_gc_compact.rb:91:in `test_gc_compact_stats'

Looking at the patch, you might be interested in the following configuration bits:

checking for mmap... yes
checking for sys/user.h... yes
checking whether PAGE_SIZE is compile-time const... no

Comparing to the x86_64 build:

checking whether PAGE_SIZE is compile-time const... yes

Updated by vo.x (Vit Ondruch) 12 months ago

Checking on CentOS Stream 9, it is passing on ppc:

https://kojihub.stream.rdu2.redhat.com/koji/taskinfo?taskID=848077

The configure checks:

checking for mmap... yes
checking for sys/user.h... yes
checking whether PAGE_SIZE is compile-time const... no

Updated by vo.x (Vit Ondruch) 12 months ago

vo.x (Vit Ondruch) wrote in #note-18:

Checking on CentOS Stream 9, it is passing on ppc:

https://kojihub.stream.rdu2.redhat.com/koji/taskinfo?taskID=848077

The configure checks:

checking for mmap... yes
checking for sys/user.h... yes
checking whether PAGE_SIZE is compile-time const... no

Oops, scratch that, it also fails on c9s:

 1) Error:
TestGCCompact#test_ast_compacts:
NotImplementedError: Compaction isn't available on this platform
    <internal:gc>:213:in `compact'
    /builddir/build/BUILD/ruby-3.0.3/test/ruby/test_gc_compact.rb:146:in `test_ast_compacts'
  2) Error:
TestGCCompact#test_compact_count:
NotImplementedError: Compaction isn't available on this platform
    <internal:gc>:213:in `compact'
    /builddir/build/BUILD/ruby-3.0.3/test/ruby/test_gc_compact.rb:152:in `test_compact_count'
  3) Error:
TestGCCompact#test_complex_hash_keys:
NotImplementedError: Compaction isn't available on this platform
    <internal:gc>:231:in `verify_compaction_references'
    /builddir/build/BUILD/ruby-3.0.3/test/ruby/test_gc_compact.rb:130:in `test_complex_hash_keys'
  4) Error:
TestGCCompact#test_gc_compact_stats:
NotImplementedError: Compaction isn't available on this platform
    <internal:gc>:213:in `compact'
    /builddir/build/BUILD/ruby-3.0.3/test/ruby/test_gc_compact.rb:91:in `test_gc_compact_stats'

Not sure why the verbose log does not show an F next to the failing tests in the list ...

Updated by peterzhu2118 (Peter Zhu) 12 months ago

I am able to repro this. Since ppc64 uses 64KB pages for mmap, we can't use mmap to allocate memory for Ruby pages (since they are 16KB). Because we can't use mmap, we can't use mprotect, which the read barriers of compaction rely on, so we can't use compaction. I forgot to include an additional change in the patch to skip the compaction tests on those systems. Sorry about that. I've attached the new patch.
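The condition described here can be sketched as a single divisibility check. This is an illustration only and not the attached patch: the function name and the idea of passing both sizes as parameters are assumptions, though the test itself mirrors the existing GC_ASSERT(alignment % sysconf(_SC_PAGE_SIZE) == 0) in rb_aligned_malloc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: mmap(), munmap() and mprotect() all work in
 * units of system pages, so a GC heap page can only be managed with
 * them if it covers a whole number of system pages. */
static int can_mmap_heap_pages(size_t system_page_size, size_t heap_page_size)
{
    assert(system_page_size != 0);
    return heap_page_size % system_page_size == 0;
}
```

A caller would pass the value of sysconf(_SC_PAGESIZE) and the 16 KiB heap page size. The check holds for 4 KiB and 16 KiB system pages and fails for 64 KiB ones, which is the case where compaction has to be reported as unavailable.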

Updated by vo.x (Vit Ondruch) 12 months ago

Thx, the latest version passes the test suite everywhere and your explanation why it does not work makes sense.

Nevertheless, this makes me wonder: if 64 KB pages are used for mmap, why won't Ruby use 64 KB pages as well? I assume that the 64 KB pages are used not just for mmap but also for other allocations, but I am far from understanding memory pages and whatnot. Or maybe this is something in the works for Ruby 3.1 ...

Sorry for my naive questions and thx for looking into this.

Updated by peterzhu2118 (Peter Zhu) 12 months ago

Thank you for checking the patch! Historically, Ruby has used 16KB pages, so there are assumptions in the GC about this. This wasn't a problem on systems with a 64KB page size when we were using posix_memalign, but we can no longer use that with compaction (the change from posix_memalign to mmap was made this year). I will look into allocating pages larger than 16KB so we can use mmap on these platforms.
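To illustrate why the page size did not matter with posix_memalign: it only guarantees the requested alignment and does not depend on the system page size. The sketch below is a hedged illustration with a hypothetical function name, not the allocator Ruby used.

```c
#define _DEFAULT_SOURCE 1   /* declares posix_memalign under strict -std= modes on glibc */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns a block of `size` bytes aligned to
 * `alignment`, or NULL on failure. The caller releases it with free(). */
static void *alloc_aligned_block(size_t alignment, size_t size)
{
    /* posix_memalign requires a power of two that is a multiple of sizeof(void *). */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    assert(alignment % sizeof(void *) == 0);

    void *p = NULL;
    if (posix_memalign(&p, alignment, size) != 0) return NULL;
    return p;
}
```

A 16 KiB block obtained this way is 16 KiB-aligned on any system, but on a 64 KiB page system it is not guaranteed to start on a system page boundary, so it cannot in general be passed to mprotect().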

Updated by vo.x (Vit Ondruch) 8 months ago

@peterzhu2118 (Peter Zhu) I wonder what the status is here. I think you requested a backport of the mmap patch in #18394. However, it seems it has not happened. Was that intentional?

Updated by vo.x (Vit Ondruch) 8 months ago

  • Related to Bug #18746: /TestGCCompact#test_(ast_compacts|compact_count|complex_hash_keys|gc_compact_stats)/ fails on PPC added

Updated by peterzhu2118 (Peter Zhu) 8 months ago

I changed the backport status of #18394 for Ruby 3.0 since it doesn't look like the patch was correctly applied.

Updated by hsbt (Hiroshi SHIBATA) 9 days ago

  • Assignee set to peterzhu2118 (Peter Zhu)