Project

General

Profile

Actions

Bug #17871

closed

TestGCCompact#test_ast_compacts test failing again

Added by jaruga (Jun Aruga) 2 months ago. Updated 2 months ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:103892]

Description

This issue was found by mame (Yusuke Endoh) yesterday on our new Power 9 server. I would like to open the ticket.

The test failure was reported and fixed on the #17306 6 months ago.
However on the latest master adcbae8d49ec04d365ce13274783b1495c3c7d0e, Power 9 (ppc64le) Ubuntu focal, I see the TestGCCompact#test_ast_compacts test fails.

$ lscpu | head -3
Architecture:                    ppc64le
Byte Order:                      Little Endian
CPU(s):                          8

$ lscpu | grep ^Model
Model:                           2.2 (pvr 004e 1202)
Model name:                      POWER9 (architected), altivec supported

$ uname -m
ppc64le

$ cat /etc/os-release | head -3
NAME="Ubuntu"
VERSION="20.04.2 LTS (Focal Fossa)"
ID=ubuntu

$ gcc --version
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ autoconf
$ ./configure \
  --prefix=${HOME}/local/ruby-master-adcbae8 \
  --enable-shared
$ make
$ make install
$ make check 2>&1 | tee check.log
...
<internal:gc>:213: [BUG] Couldn't unprotect page 0x00000a1b33ba8000
ruby 3.1.0dev (2021-05-19T05:24:01Z master adcbae8d49) [powerpc64le-linux]

-- Control frame information -----------------------------------------------
c:0031 p:0003 s:0174 e:000173 METHOD <internal:gc>:213
c:0030 p:0026 s:0170 e:000169 METHOD /home/jaruga/git/ruby/ruby/test/ruby/test_gc_compact.rb:154
...
c:0001 p:0000 s:0003 E:001550 (none) [FINISH]
...

-- C level backtrace information -------------------------------------------
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_vm_bugreport+0x1b4) [0x7441be9c78f4] vm_dump.c:759
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_bug_without_die+0x9c) [0x7441be740dec] error.c:777
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(die+0x0) [0x7441be6807dc] error.c:785
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_bug) error.c:787
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(unlock_page_body+0x10) [0x7441be771e80] gc.c:4889
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_fill_swept_page) gc.c:5184
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_page_sweep) gc.c:5392
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_sweep_step) gc.c:5562
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_sweep_rest+0x24) [0x7441be77208c] gc.c:5619
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_sweep) gc.c:5737
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_marks+0x15c) [0x7441be778d08] gc.c:8024
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_start) gc.c:8854
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_multi_ractor_p+0x0) [0x7441be77aa38] gc.c:8742
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_vm_lock_leave) vm_sync.h:92
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(garbage_collect) gc.c:8744
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_start_internal) gc.c:9086
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(gc_compact) gc.c:10002
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(builtin_invoker0+0x24) [0x7441be988d34] vm_insnhelper.c:5429
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_exec_core+0x1914) [0x7441be9a8b14] vm_insnhelper.c:5569
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_vm_exec+0x14c) [0x7441be9acfdc] vm.c:2169
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_yield+0x288) [0x7441be9b23c8] vm.c:1260
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_ary_collect+0x74) [0x7441be68ca24] array.c:3646
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(ractor_safe_call_cfunc_0+0x24) [0x7441be9885b4] vm_insnhelper.c:2760
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_cfunc_with_frame+0x140) [0x7441be993d30] vm_insnhelper.c:2943
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_sendish+0x364) [0x7441be9a2ad4] vm_insnhelper.c:4516
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_exec_core+0x294) [0x7441be9a7494] insns.def:754
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_vm_exec+0x14c) [0x7441be9acfdc] vm.c:2169
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_yield+0x288) [0x7441be9b23c8] vm.c:1260
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_ary_each+0x54) [0x7441be683844] array.c:2534
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(ractor_safe_call_cfunc_0+0x24) [0x7441be9885b4] vm_insnhelper.c:2760
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_cfunc_with_frame+0x140) [0x7441be993d30] vm_insnhelper.c:2943
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method_each_type+0xc0) [0x7441be9ae6d0] vm_insnhelper.c:3433
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method+0xdc) [0x7441be9aeeec] vm_insnhelper.c:3537
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method_each_type+0x510) [0x7441be9aeb20] vm_insnhelper.c:3412
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method+0xdc) [0x7441be9aeeec] vm_insnhelper.c:3537
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_sendish+0x364) [0x7441be9a2ad4] vm_insnhelper.c:4516
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_exec_core+0x294) [0x7441be9a7494] insns.def:754
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_vm_exec+0x14c) [0x7441be9acfdc] vm.c:2169
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_yield+0x288) [0x7441be9b23c8] vm.c:1260
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_ary_each+0x54) [0x7441be683844] array.c:2534
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(ractor_safe_call_cfunc_0+0x24) [0x7441be9885b4] vm_insnhelper.c:2760
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_cfunc_with_frame+0x140) [0x7441be993d30] vm_insnhelper.c:2943
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method_each_type+0xc0) [0x7441be9ae6d0] vm_insnhelper.c:3433
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method+0xdc) [0x7441be9aeeec] vm_insnhelper.c:3537
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method_each_type+0x510) [0x7441be9aeb20] vm_insnhelper.c:3412
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method+0xdc) [0x7441be9aeeec] vm_insnhelper.c:3537
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_sendish+0x364) [0x7441be9a2ad4] vm_insnhelper.c:4516
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_exec_core+0x294) [0x7441be9a7494] insns.def:754
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_vm_exec+0x14c) [0x7441be9acfdc] vm.c:2169
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_iseq_eval+0x190) [0x7441be9b0bf0] vm.c:2406
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(require_internal+0x998) [0x7441be7cc8c8] load.c:594
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_require_string+0x44) [0x7441be7cd8f4] load.c:1142
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_f_require_relative+0x48) [0x7441be7cd9e8] load.c:857
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(ractor_safe_call_cfunc_1+0x28) [0x7441be988608] vm_insnhelper.c:2767
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_cfunc_with_frame+0x140) [0x7441be993d30] vm_insnhelper.c:2943
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method_each_type+0xc0) [0x7441be9ae6d0] vm_insnhelper.c:3433
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_call_method+0xdc) [0x7441be9aeeec] vm_insnhelper.c:3537
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(vm_exec_core+0x1aac) [0x7441be9a8cac] vm_insnhelper.c:4516
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_vm_exec+0x14c) [0x7441be9acfdc] vm.c:2169
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_iseq_eval_main+0xf0) [0x7441be9b0d50] vm.c:2417
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(rb_ec_exec_node+0xb8) [0x7441be74a218] eval.c:317
/home/jaruga/git/ruby/ruby/libruby.so.3.1.0(ruby_run_node+0x7c) [0x7441be74e94c] eval.c:375
/home/jaruga/git/ruby/ruby/ruby(main+0x90) [0xa1b10ce0bb0] ./main.c:47
...
make: *** [uncommon.mk:802: yes-test-all] Aborted (core dumped)
$ make test-all TESTOPTS="-n test_compact_count" TESTS=test/ruby/test_gc_compact.rb
...
# Running tests:

[1/1] TestGCCompact#test_compact_count<internal:gc>:213: [BUG] Couldn't protect page 0x00000f8747228000
ruby 3.1.0dev (2021-05-19T05:24:01Z master adcbae8d49) [powerpc64le-linux]

-- Control frame information -----------------------------------------------
c:0031 p:0003 s:0174 e:000173 METHOD <internal:gc>:213
<internal:gc>:213: [BUG] Couldn't unprotect page 0x00000f8747234000
ruby 3.1.0dev (2021-05-19T05:24:01Z master adcbae8d49) [powerpc64le-linux]

-- Control frame information -----------------------------------------------
c:0031 p:0003 s:0174 e:000173 METHOD <internal:gc>:213
<internal:gc>:213: [BUG] Segmentation fault at 0x00000f8747236360
ruby 3.1.0dev (2021-05-19T05:24:01Z master adcbae8d49) [powerpc64le-linux]

-- Control frame information -----------------------------------------------
c:0031 p:0003 s:0174 e:000173 METHOD <internal:gc>:213
make: *** [uncommon.mk:802: yes-test-all] Segmentation fault (core dumped)

Files

check.log (1.14 MB) check.log jaruga (Jun Aruga), 05/19/2021 08:06 PM

Related issues

Related to Ruby master - Bug #17306: TestGCCompact#test_ast_compacts test failuresClosedtenderlovemaking (Aaron Patterson)Actions

Updated by xtkoba (Tee KOBAYASHI) 2 months ago

Some googling told me that the page size defaults to 64k on powerpc64le for some Linux distros. Possibly sysconf(3) does not report the correct page size?

Updated by jaruga (Jun Aruga) 2 months ago

I had a resistance to upload the full log check.log, because the file size is big. However I would upload the log file now.

$ du -sh check.log 
1.2M    check.log

$ wc -l check.log 
22963 check.log
Actions #3

Updated by mame (Yusuke Endoh) 2 months ago

  • Related to Bug #17306: TestGCCompact#test_ast_compacts test failures added

Updated by mame (Yusuke Endoh) 2 months ago

xtkoba (Tee KOBAYASHI) wrote in #note-1:

Possibly sysconf(3) does not report the correct page size?

I think it reports 64k correctly.

$ ruby -retc -e'p Etc.sysconf(Etc::SC_PAGE_SIZE)'
65536

The fix for #17306 disabled the auto compaction on a platform whose page size is greater than 4k, like ppc64le. However, TestGCCompact#test_ast_compacts is a test for (not auto) compaction, so it is still executed on ppc64le. I'm unsure why the fix addressed the issue of #17306.

Updated by jaruga (Jun Aruga) 2 months ago

Running the each test in test/ruby/test_gc_compact.rb on the above environment, here is the result. The failed test is not only the test_ast_compacts test.

  • test_enable_autocompact : ok
  • test_disable_autocompact : ok
  • test_major_compacts : ok
  • test_implicit_compaction_does_something : ok
  • test_gc_compact_stats : error
  • test_complex_hash_keys : error
  • test_ast_compacts : error
  • test_compact_count : error

Here are the the commands I executed.

$ make test-all TESTS="-v -n test_enable_autocompact test/ruby/test_gc_compact.rb"
  # => ok
$ make test-all TESTS="-v -n test_disable_autocompact test/ruby/test_gc_compact.rb"
  # => ok
$ make test-all TESTS="-v -n test_major_compacts test/ruby/test_gc_compact.rb"
  # => ok
$ make test-all TESTS="-v -n test_implicit_compaction_does_something test/ruby/test_gc_compact.rb"
  # => ok
$ make test-all TESTS="-v -n test_gc_compact_stats test/ruby/test_gc_compact.rb"
  # => error (Segmentation fault)
$ make test-all TESTS="-v -n test_complex_hash_keys test/ruby/test_gc_compact.rb"
  # => error (Segmentation fault)
$ make test-all TESTS="-v -n test_ast_compacts test/ruby/test_gc_compact.rb"
  # => error (<internal:gc>:213: [BUG] Couldn't unprotect page 0x00000216f95c4000)
$ make test-all TESTS="-v -n test_compact_count test/ruby/test_gc_compact.rb"
  # => error (Segmentation fault)

Updated by xtkoba (Tee KOBAYASHI) 2 months ago

mame (Yusuke Endoh) wrote in #note-4:

I'm unsure why the fix addressed the issue of #17306.

This is because after #17306 was fixed the commit 32b7dcfb56a417c1d1c354102351fc1825d653bf changed the behavior of {,un}lock_page_body so that they call mprotect(2) for an explicit compaction, regardless of the page size.

Updated by tenderlovemaking (Aaron Patterson) 2 months ago

xtkoba (Tee KOBAYASHI) wrote in #note-6:

mame (Yusuke Endoh) wrote in #note-4:

I'm unsure why the fix addressed the issue of #17306.

This is because after #17306 was fixed the commit 32b7dcfb56a417c1d1c354102351fc1825d653bf changed the behavior of {,un}lock_page_body so that they call mprotect(2) for an explicit compaction, regardless of the page size.

Ya, that's correct. Even manual compaction requires the mprotect read barrier. Basically we need to disable compaction on platforms that don't support it. I'll make a fix.

Actions #8

Updated by tenderlovemaking (Aaron Patterson) 2 months ago

  • Status changed from Open to Closed

Applied in changeset git|fc832ffbfaf581ff63ef40dc3f4ec5c8ff39aae6.


Disable compaction on platforms that can't support it

Manual compaction also requires a read barrier, so we need to disable
even manual compaction on platforms that don't support mprotect.

[Bug #17871]

Updated by jaruga (Jun Aruga) 2 months ago

tenderlovemaking (Aaron Patterson) thanks for fixing it! I removed the skipped gc tests on Travis ppc64le at the af43198738bf45d55d91d7f48b197f94dc526967 .

Actions

Also available in: Atom PDF