Project

General

Profile

Actions

Bug #14090

open

`TestGc#test_interrupt_in_finalizer` fails very rarely

Added by mame (Yusuke Endoh) almost 4 years ago. Updated 6 months ago.

Status:
Open
Priority:
Normal
Target version:
-
ruby -v:
ruby 2.5.0dev (2017-11-07 trunk 60693) [x86_64-linux]
[ruby-core:83693]
Tags:

Description

TestGc#test_interrupt_in_finalizer fails very rarely, only once every handred or thousand runs.

# Running tests:

[1/1] TestGc#test_interrupt_in_finalizer = 10.13 s
  1) Error:
TestGc#test_interrupt_in_finalizer:
Timeout::Error: execution of assert_in_out_err expired
pid 24697 killed by SIGABRT (signal 6) (core dumped)
| 
| [BUG] Segmentation fault at 0x000003e800006075
| ruby 2.5.0dev (2017-11-07) [x86_64-linux]
| 
| -- Control frame information -----------------------------------------------
| 
| 
| -- Machine register context ------------------------------------------------
|  RIP: 0x00007f80612bb072 RBP: 0x000055c3587e1efc RSP: 0x00007ffc4f8100b0
|  RAX: 0xfffffffffffffffc RBX: 0x000055c3587e1ee4 RCX: 0x00007f80612bb072
|  RDX: 0x0000000000000000 RDI: 0x000055c3587e1efc RSI: 0x0000000000000080
|   R8: 0x00000000000000ca  R9: 0x0000000000000000 R10: 0x0000000000000000
|  R11: 0x0000000000000246 R12: 0x000055c3587e1ed0 R13: 0x00007ffc4f810110
|  R14: 0x000055c3587e1f38 R15: 0x0000000000000003 EFL: 0x0000000000000246
| 
| -- C level backtrace information -------------------------------------------
| /home/mame/work/ruby.tmp/ruby(rb_vm_bugreport+0x7d3) [0x55c357e7a333] vm_dump.c:703
| /home/mame/work/ruby.tmp/ruby(rb_bug_context+0xd1) [0x55c357e6de11] error.c:554
| /home/mame/work/ruby.tmp/ruby(sigsegv+0x42) [0x55c357d5e602] signal.c:928
| /lib/x86_64-linux-gnu/libpthread.so.0 [0x7f80612c0150]
| /lib/x86_64-linux-gnu/libpthread.so.0(pthread_cond_wait+0x152) [0x7f80612bb072]
| /home/mame/work/ruby.tmp/ruby(native_sleep.constprop.79+0x1de) [0x55c357d967fe] thread_pthread.c:340
| /home/mame/work/ruby.tmp/ruby(rb_thread_terminate_all+0x1e0) [0x55c357d9aba0] thread.c:507
| /home/mame/work/ruby.tmp/ruby(ruby_cleanup+0x17e) [0x55c357c6078e] eval.c:188
| /home/mame/work/ruby.tmp/ruby(ruby_run_node+0x36) [0x55c357c60aa6] eval.c:302
| /home/mame/work/ruby.tmp/ruby(main+0x5f) [0x55c357c5ca1f] encoding.c:164
| 
| -- Other runtime information -----------------------------------------------
| 
| * Loaded script: -e
| 
| * Loaded features:
| 
|     0 enumerator.so
|     1 thread.rb
|     2 rational.so
|     3 complex.so
|     4 /home/mame/work/ruby.tmp/.ext/x86_64-linux/enc/encdb.so
|     5 /home/mame/work/ruby.tmp/.ext/x86_64-linux/enc/trans/transdb.so
|     6 /home/mame/work/ruby.tmp/rbconfig.rb
|     7 /home/mame/work/ruby.tmp/lib/rubygems/compatibility.rb
|     8 /home/mame/work/ruby.tmp/lib/rubygems/defaults.rb
|     9 /home/mame/work/ruby.tmp/lib/rubygems/deprecate.rb
|    10 /home/mame/work/ruby.tmp/lib/rubygems/errors.rb
|    11 /home/mame/work/ruby.tmp/lib/rubygems/version.rb
|    12 /home/mame/work/ruby.tmp/lib/rubygems/requirement.rb
|    13 /home/mame/work/ruby.tmp/lib/rubygems/platform.rb
|    14 /home/mame/work/ruby.tmp/lib/rubygems/basic_specification.rb
|    15 /home/mame/work/ruby.tmp/lib/rubygems/stub_specification.rb
|    16 /home/mame/work/ruby.tmp/lib/rubygems/util/list.rb
|    17 /home/mame/work/ruby.tmp/.ext/x86_64-linux/stringio.so
|    18 /home/mame/work/ruby.tmp/lib/rubygems/specification.rb
|    19 /home/mame/work/ruby.tmp/lib/rubygems/exceptions.rb
|    20 /home/mame/work/ruby.tmp/lib/rubygems/core_ext/kernel_gem.rb
|    21 /home/mame/work/ruby.tmp/lib/monitor.rb
|    22 /home/mame/work/ruby.tmp/lib/rubygems/core_ext/kernel_require.rb
|    23 /home/mame/work/ruby.tmp/lib/rubygems.rb
|    24 /home/mame/work/ruby.tmp/lib/rubygems/dependency.rb
|    25 /home/mame/work/ruby.tmp/lib/rubygems/path_support.rb
| 
| * Process memory map:
| 
| 55c357c3a000-55c357f55000 r-xp 00000000 fd:01 21892603                   /home/mame/work/ruby.tmp/ruby
| 55c358155000-55c35815a000 r--p 0031b000 fd:01 21892603                   /home/mame/work/ruby.tmp/ruby
| 55c35815a000-55c35815b000 rw-p 00320000 fd:01 21892603                   /home/mame/work/ruby.tmp/ruby
| 55c35815b000-55c35816c000 rw-p 00000000 00:00 0 
| 55c3587e1000-55c358b0d000 rw-p 00000000 00:00 0                          [heap]
| 7f8058000000-7f8058021000 rw-p 00000000 00:00 0 
| 7f8058021000-7f805c000000 ---p 00000000 00:00 0 
| 7f805e93f000-7f805eb1e000 r--s 00000000 fd:01 15604574                   /lib/x86_64-linux-gnu/libc-2.26.so
| 7f805eb1e000-7f805fb92000 r--s 00000000 fd:01 21892603                   /home/mame/work/ruby.tmp/ruby
| 7f805fb92000-7f805fba8000 r-xp 00000000 fd:01 15597591                   /lib/x86_64-linux-gnu/libgcc_s.so.1
| 7f805fba8000-7f805fda7000 ---p 00016000 fd:01 15597591                   /lib/x86_64-linux-gnu/libgcc_s.so.1
| 7f805fda7000-7f805fda8000 r--p 00015000 fd:01 15597591                   /lib/x86_64-linux-gnu/libgcc_s.so.1
| 7f805fda8000-7f805fda9000 rw-p 00016000 fd:01 15597591                   /lib/x86_64-linux-gnu/libgcc_s.so.1
| 7f805fda9000-7f805feaa000 rw-p 00000000 00:00 0 
| 7f805feaa000-7f805feb3000 r-xp 00000000 fd:01 23333969                   /home/mame/work/ruby.tmp/.ext/x86_64-linux/stringio.so
| 7f805feb3000-7f80600b2000 ---p 00009000 fd:01 23333969                   /home/mame/work/ruby.tmp/.ext/x86_64-linux/stringio.so
| 7f80600b2000-7f80600b3000 r--p 00008000 fd:01 23333969                   /home/mame/work/ruby.tmp/.ext/x86_64-linux/stringio.so
| 7f80600b3000-7f80600b4000 rw-p 00009000 fd:01 23333969                   /home/mame/work/ruby.tmp/.ext/x86_64-linux/stringio.so
| 7f80600b4000-7f80600b6000 r-xp 00000000 fd:01 23333689                   /home/mame/work/ruby.tmp/.ext/x86_64-linux/enc/trans/transdb.so
| 7f80600b6000-7f80602b6000 ---p 00002000 fd:01 23333689                   /home/mame/work/ruby.tmp/.ext/x86_64-linux/enc/trans/transdb.so
| 7f80602b6000-7f80602b7000 r--p 00002000 fd:01 23333689                   /home/mame/work/ruby.tmp/.ext/x86_64-linux/enc/trans/transdb.so
| 7f80602b7000-7f80602b8000 rw-p 00003000 fd:01 23333689                   /home/mame/work/ruby.tmp/.ext/x86_64-linux/enc/trans/transdb.so
| 7f80602b8000-7f80602ba000 r-xp 00000000 fd:01 23333649                   /home/mame/work/ruby.tmp/.ext/x86_64-linux/enc/encdb.so
| 7f80602ba000-7f80604b9000 ---p 00002000 fd:01 23333649                   /home/mame/work/ruby.tmp/.ext/x86_64-linux/enc/encdb.so
| 7f80604b9000-7f80604ba000 r--p 00001000 fd:01 23333649                   /home/mame/work/ruby.tmp/.ext/x86_64-linux/enc/encdb.so
| 7f80604ba000-7f80604bb000 rw-p 00002000 fd:01 23333649                   /home/mame/work/ruby.tmp/.ext/x86_64-linux/enc/encdb.so
| 7f80604bb000-7f8060691000 r-xp 00000000 fd:01 15604574                   /lib/x86_64-linux-gnu/libc-2.26.so
| 7f8060691000-7f8060891000 ---p 001d6000 fd:01 15604574                   /lib/x86_64-linux-gnu/libc-2.26.so
| 7f8060891000-7f8060895000 r--p 001d6000 fd:01 15604574                   /lib/x86_64-linux-gnu/libc-2.26.so
| 7f8060895000-7f8060897000 rw-p 001da000 fd:01 15604574                   /lib/x86_64-linux-gnu/libc-2.26.so
| 7f8060897000-7f806089b000 rw-p 00000000 00:00 0 
| 7f806089b000-7f80609f0000 r-xp 00000000 fd:01 15604578                   /lib/x86_64-linux-gnu/libm-2.26.so
| 7f80609f0000-7f8060bef000 ---p 00155000 fd:01 15604578                   /lib/x86_64-linux-gnu/libm-2.26.so
| 7f8060bef000-7f8060bf0000 r--p 00154000 fd:01 15604578                   /lib/x86_64-linux-gnu/libm-2.26.so
| 7f8060bf0000-7f8060bf1000 rw-p 00155000 fd:01 15604578                   /lib/x86_64-linux-gnu/libm-2.26.so
| 7f8060bf1000-7f8060bfa000 r-xp 00000000 fd:01 15604576                   /lib/x86_64-linux-gnu/libcrypt-2.26.so
| 7f8060bfa000-7f8060df9000 ---p 00009000 fd:01 15604576                   /lib/x86_64-linux-gnu/libcrypt-2.26.so
| 7f8060df9000-7f8060dfa000 r--p 00008000 fd:01 15604576                   /lib/x86_64-linux-gnu/libcrypt-2.26.so
| 7f8060dfa000-7f8060dfb000 rw-p 00009000 fd:01 15604576                   /lib/x86_64-linux-gnu/libcrypt-2.26.so
| 7f8060dfb000-7f8060e29000 rw-p 00000000 00:00 0 
| 7f8060e29000-7f8060e2c000 r-xp 00000000 fd:01 15604577                   /lib/x86_64-linux-gnu/libdl-2.26.so
| 7f8060e2c000-7f806102b000 ---p 00003000 fd:01 15604577                   /lib/x86_64-linux-gnu/libdl-2.26.so
| 7f806102b000-7f806102c000 r--p 00002000 fd:01 15604577                   /lib/x86_64-linux-gnu/libdl-2.26.so
| 7f806102c000-7f806102d000 rw-p 00003000 fd:01 15604577                   /lib/x86_64-linux-gnu/libdl-2.26.so
| 7f806102d000-7f80610ab000 r-xp 00000000 fd:01 12068737                   /usr/lib/x86_64-linux-gnu/libgmp.so.10.3.2
| 7f80610ab000-7f80612ab000 ---p 0007e000 fd:01 12068737                   /usr/lib/x86_64-linux-gnu/libgmp.so.10.3.2
| 7f80612ab000-7f80612ac000 r--p 0007e000 fd:01 12068737                   /usr/lib/x86_64-linux-gnu/libgmp.so.10.3.2
| 7f80612ac000-7f80612ad000 rw-p 0007f000 fd:01 12068737                   /usr/lib/x86_64-linux-gnu/libgmp.so.10.3.2
| 7f80612ad000-7f80612c7000 r-xp 00000000 fd:01 15604589                   /lib/x86_64-linux-gnu/libpthread-2.26.so
| 7f80612c7000-7f80614c6000 ---p 0001a000 fd:01 15604589                   /lib/x86_64-linux-gnu/libpthread-2.26.so
| 7f80614c6000-7f80614c7000 r--p 00019000 fd:01 15604589                   /lib/x86_64-linux-gnu/libpthread-2.26.so
| 7f80614c7000-7f80614c8000 rw-p 0001a000 fd:01 15604589                   /lib/x86_64-linux-gnu/libpthread-2.26.so
| 7f80614c8000-7f80614cc000 rw-p 00000000 00:00 0 
| 7f80614cc000-7f80614f3000 r-xp 00000000 fd:01 15602223                   /lib/x86_64-linux-gnu/ld-2.26.so
| 7f8061592000-7f80615b6000 r--s 00000000 fd:01 15604589                   /lib/x86_64-linux-gnu/libpthread-2.26.so
| 7f80615b6000-7f80616bc000 rw-p 00000000 00:00 0 
| 7f80616cb000-7f80616cc000 ---p 00000000 00:00 0 
| 7f80616cc000-7f80616ec000 rw-p 00000000 00:00 0 
| 7f80616ec000-7f80616ed000 ---p 00000000 00:00 0 
| 7f80616ed000-7f80616f3000 rw-p 00000000 00:00 0 
| 7f80616f3000-7f80616f4000 r--p 00027000 fd:01 15602223                   /lib/x86_64-linux-gnu/ld-2.26.so
| 7f80616f4000-7f80616f5000 rw-p 00028000 fd:01 15602223                   /lib/x86_64-linux-gnu/ld-2.26.so
| 7f80616f5000-7f80616f6000 rw-p 00000000 00:00 0 
| 7ffc4f013000-7ffc4f812000 rw-p 00000000 00:00 0                          [stack]
| 7ffc4f904000-7ffc4f907000 r--p 00000000 00:00 0                          [vvar]
| 7ffc4f907000-7ffc4f909000 r-xp 00000000 00:00 0                          [vdso]
| ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
| 
| 
| [NOTE]
| You may have encountered a bug in the Ruby interpreter or extension libraries.
| Bug reports are welcome.
| For details: http://www.ruby-lang.org/bugreport.html
| 

    /home/mame/work/ruby.tmp/test/ruby/test_gc.rb:354:in `test_interrupt_in_finalizer'

Finished tests in 10.127763s, 0.0987 tests/s, 0.2962 assertions/s.
1 tests, 3 assertions, 0 failures, 1 errors, 0 skips

How to reproduce

1) Apply this patch. This removes a mitigation of this issue.

diff --git a/thread.c b/thread.c
index bfa903c6a4..dfaf75d1ce 100644
--- a/thread.c
+++ b/thread.c
@@ -507,7 +507,7 @@ rb_thread_terminate_all(void)
         * me when the last sub-thread exit.
         */
        sleeping = 1;
-       native_sleep(th, &tv);
+       native_sleep(th, 0);
        RUBY_VM_CHECK_INTS_BLOCKING(ec);
        sleeping = 0;
    }

2) Run make test-all many times. The following command would be useful.

make && while make test-all TESTOPTS="test/ruby/test_gc.rb -n test_interrupt_in_finalizer"; do date; done

FYI: With execution counter

make && i=0 && while make test-all TESTOPTS="test/ruby/test_gc.rb -n test_interrupt_in_finalizer"; do echo; date; echo "trial:$i"; i=`expr $i + 1`; done

Details

TestGc#test_interrupt_in_finalizer checks if SIGINT can interrupt the GC finalizers. This test itself runs on a child process, and the process should end with SIGINT. If the process does not end in ten seconds, the parent sends SIGSEGV to the child, terminates the test, and reports it as a failure. ("C level backtrace information" has "sigsegv", but don't worry, this SEGV would be the one the parent sent. I guess this bug is not so significant, parhaps.)

When a main thread of Ruby process ends, it terminates all child threads and waits for them. However, for unknown reason (maybe depending upon the timing of SIGINT?), it sometimes fails synchronization: all child threads end, and the main thread meaninglessly waits forever.

Based on Ko1's proposal, I committed a tiny change to mitigate this issue at r60694: instead of waiting forever, the main thread wakes up every one second to monitor all child threads. This is not an essential solution for this issue, but just hides. To debug this issue, we need remove the mitigation by the patch described above.

Actions #1

Updated by naruse (Yui NARUSE) almost 4 years ago

  • Target version deleted (2.5)

Updated by jeremyevans0 (Jeremy Evans) 6 months ago

I tried the following patch to disable the mitigation:

diff --git a/thread.c b/thread.c
index 4f2debdaf2..0fdef76445 100644
--- a/thread.c
+++ b/thread.c
@@ -601,13 +601,12 @@ rb_thread_terminate_all(rb_thread_t *th)
        terminate_all(cr, th);

        while (rb_ractor_living_thread_num(cr) > 1) {
-            rb_hrtime_t rel = RB_HRTIME_PER_SEC;
            /*q
             * Thread exiting routine in thread_start_func_2 notify
             * me when the last sub-thread exit.
             */
            sleeping = 1;
-           native_sleep(th, &rel);
+           native_sleep(th, 0);
            RUBY_VM_CHECK_INTS_BLOCKING(ec);
            sleeping = 0;
        }

I then ran the test_interrupt_in_finalizer test 10,000 times without it failing. Are we sure this is still a problem? Maybe it depends on the operating system, as I'm running on OpenBSD and I'm guessing the problem was observed on Linux?

The reason I bring this up is I want to see if https://github.com/ruby/ruby/pull/4366 fixed this issue, but I can't reproduce the initial problem.

Updated by xtkoba (Tee KOBAYASHI) 6 months ago

This seems to be still a problem with a recent revision of Ruby on a recent GNU/Linux environment:

Wed Apr 28 02:39:30 JST 2021
trial:3673
../revision.h unchanged
Run options:
  --seed=50002
  "--ruby=./miniruby -I../lib -I. -I.ext/common  ../tool/runruby.rb --extout=.ext  -- --disable-gems"
  --excludes-dir=../test/excludes
  --name=!/memory_leak/
  -n
  test_interrupt_in_finalizer

# Running tests:

[1/1] TestGc#test_interrupt_in_finalizer = 11.03 s
  1) Error:
TestGc#test_interrupt_in_finalizer:
Timeout::Error: execution of assert_in_out_err expired timeout (10 sec)
pid 7973 killed by SIGABRT (signal 6)
|
| -e: [BUG] Segmentation fault at 0x000003e800001f23
| ruby 3.1.0dev (2021-04-24T02:17:20Z master dee58d7ae7) [x86_64-linux]
|   built with clang version 12.0.0
|
| -- Control frame information -----------------------------------------------
| c:0001 p:0000 s:0003 E:001140 (none) [FINISH]
|
|
| -- Machine register context ------------------------------------------------
|  RIP: 0x00007fd50dfd13d6 RBP: 0x0000562211811fe0 RSP: 0x00007ffdd9599a40
|  RAX: 0xfffffffffffffffc RBX: 0x0000562211812ce8 RCX: 0x00007fd50dfd13d6
|  RDX: 0x0000000000000000 RDI: 0x00007ffdd9599aa0 RSI: 0x0000000000000002
|   R8: 0x0000000000000008  R9: 0x0000000000000000 R10: 0x0000000000000000
|  R11: 0x0000000000000293 R12: 0x0000000000000000 R13: 0x0000562211812cc0
|  R14: 0x0000000000000000 R15: 0x0000562211811fe0 EFL: 0x0000000000000293
|
| -- C level backtrace information -------------------------------------------

    /mnt/1TB/ruby.build/ruby-trunk-dee58d7/test/ruby/test_gc.rb:382:in `test_interrupt_in_finalizer'

Finished tests in 11.106566s, 0.0900 tests/s, 0.2701 assertions/s.
1 tests, 3 assertions, 0 failures, 1 errors, 0 skips

ruby -v: ruby 3.1.0dev (2021-04-24T02:17:20Z master dee58d7ae7) [x86_64-linux]
  built with clang version 12.0.0
make: *** [uncommon.mk:802: yes-test-all] Error 1

Updated by jeremyevans0 (Jeremy Evans) 6 months ago

xtkoba (Tee KOBAYASHI) wrote in #note-3:

This seems to be still a problem with a recent revision of Ruby on a recent GNU/Linux environment:

xtkoba (Tee KOBAYASHI) Thank you very much for testing. Are you able to check whether https://github.com/ruby/ruby/pull/4366 resolves the issue?

Updated by xtkoba (Tee KOBAYASHI) 6 months ago

jeremyevans0 (Jeremy Evans) I tried the PR and it does not seem to resolve the intermittent timeout issue in test_interrupt_in_finalizer when the mitigation is disabled.

Actions

Also available in: Atom PDF