Bug #16921

s390x: random test failures for timeout or segmentation fault

Added by jaruga (Jun Aruga) 6 months ago. Updated 4 months ago.

Status: Open
Priority: Normal
Assignee: -
Target version: -
[ruby-core:98569]

Description

I observed the following errors, related to timeouts or segmentation faults, on Ruby 2.7.1 on s390x in Fedora builds. The tests sometimes fail. The s390x machine used in CI seems to be slow. If the failures essentially come from the length of the timeout, it would be great to have an easy way to change the timeout.

It might be related to #16492.

  1) Failure:
Fiddle::TestFunction#test_nogvl_poll [/builddir/build/BUILD/ruby-2.7.1/test/fiddle/test_function.rb:95]:
slept amount of time.
Expected |200 - 714| (514) to be <= 180.
  2) Error:
TestProcess#test_status_quit:
Timeout::Error: execution of assert_in_out_err expired timeout (10 sec)
pid 2682640 killed by SIGKILL (signal 9)
|-
    /builddir/build/BUILD/ruby-2.7.1/test/ruby/test_process.rb:1446:in `block in test_status_quit'
    /builddir/build/BUILD/ruby-2.7.1/test/ruby/test_process.rb:37:in `block (2 levels) in with_tmpchdir'
    /builddir/build/BUILD/ruby-2.7.1/test/ruby/test_process.rb:36:in `chdir'
    /builddir/build/BUILD/ruby-2.7.1/test/ruby/test_process.rb:36:in `block in with_tmpchdir'
    /builddir/build/BUILD/ruby-2.7.1/lib/tmpdir.rb:89:in `mktmpdir'
    /builddir/build/BUILD/ruby-2.7.1/test/ruby/test_process.rb:34:in `with_tmpchdir'
    /builddir/build/BUILD/ruby-2.7.1/test/ruby/test_process.rb:1445:in `test_status_quit'
  3) Error:
TestRubyOptions#test_segv_test:
Timeout::Error: execution of assert_in_out_err expired timeout (10 sec)
pid 2683275 killed by SIGABRT (signal 6) (core dumped)
  1) Error:
TestRubyOptions#test_segv_loaded_features:
Timeout::Error: execution of assert_in_out_err expired timeout (10 sec)
pid 2181948 killed by SIGKILL (signal 9)
|-
| -e:1: [BUG] Segmentation fault at 0x00214b3c000003e8
  2) Error:
TestRubyOptions#test_segv_setproctitle:
Timeout::Error: execution of assert_in_out_err expired timeout (10 sec)
pid 2181951 killed by SIGKILL (signal 9)
|-
| -e:1: [BUG] Segmentation fault at 0x00214b3f000003e8
  1) Error:
TestSignal#test_ignored_interrupt:
Timeout::Error: execution of assert_in_out_err expired timeout (10 sec)
pid 2032663 killed by SIGKILL (signal 9)
| 
    /builddir/build/BUILD/ruby-2.7.1/test/ruby/test_signal.rb:294:in `test_ignored_interrupt'
Finished tests in 564.613810s, 37.2130 tests/s, 4826.1873 assertions/s.

Updated by jaruga (Jun Aruga) 6 months ago

  • Subject changed from s390x: ramdom test failures for timeout or segmentation fault to s390x: random test failures for timeout or segmentation fault

Updated by mame (Yusuke Endoh) 6 months ago

Try an environment variable RUBY_TEST_TIMEOUT_SCALE. If you set RUBY_TEST_TIMEOUT_SCALE=10, all timeout tests wait 10 times longer.
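
As a rough illustration of the idea, here is a minimal sketch of how a test helper could scale its timeouts from that environment variable. The `TimeoutScale` module and its methods are hypothetical names made up for this example; this is not the code Ruby's test suite actually uses.

```ruby
# Hypothetical helper, not Ruby's actual test-suite code.
# Reads a scale factor from the environment and multiplies timeouts by it.
module TimeoutScale
  module_function

  # Returns the scale as a Float, or 1.0 when the variable is unset or invalid.
  def scale(env = ENV)
    value = Float(env.fetch("RUBY_TEST_TIMEOUT_SCALE", "1"), exception: false)
    value && value > 0 ? value : 1.0
  end

  # Multiplies a base timeout (in seconds) by the configured scale.
  def apply(seconds, env = ENV)
    seconds * scale(env)
  end
end

# The 10-second limit from the failures above becomes 100 seconds at scale 10.
puts TimeoutScale.apply(10, { "RUBY_TEST_TIMEOUT_SCALE" => "10" })
```

Passing the environment as an argument is only a convenience for trying the sketch without changing the real process environment.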

Updated by jaruga (Jun Aruga) 6 months ago

Thanks for the info! I will try the environment variable, and will let you know here.

Updated by vo.x (Vit Ondruch) 6 months ago

I think there is a combination of two issues. There is possibly a bug in EnvUtil.invoke_ruby, which cannot properly handle failures caused by allocation in the SIGSEGV handler. I vaguely remember trying to debug the issue, because the TestRubyOptions#test_segv_setproctitle error is nothing new: #13758

Updated by jaruga (Jun Aruga) 4 months ago

> Try an environment variable RUBY_TEST_TIMEOUT_SCALE. If you set RUBY_TEST_TIMEOUT_SCALE=10, all timeout tests wait 10 times longer.

Sorry for the late response. I confirmed that the environment variable improves the situation. See https://bugs.ruby-lang.org/issues/16492#note-8 for details.

By the way, here again are the existing related issues.

  • TestBugReporter#test_bug_reporter_add #16492
  • TestRubyOptions#test_segv_setproctitle #13758

Updated by jaruga (Jun Aruga) 4 months ago

I sent a PR related to this ticket.
https://github.com/ruby/ruby/pull/3354

It applies the timeout scale to test_nogvl_poll.
The other tests I showed above already apply the timeout scale.

I ran the Ruby tests with "RUBY_TEST_TIMEOUT_SCALE=100 make check" 5 times on the Fedora build system on s390x, and all runs passed.
So I am fine with closing this ticket.
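
To illustrate what scaling means for a tolerance-based check like the one in test_nogvl_poll, here is a minimal sketch using the numbers from the failure reported above. The helper name is hypothetical, the values are assumed to be milliseconds, and this is not the code from the PR.

```ruby
# Hypothetical sketch; the name is illustrative and not taken from test/fiddle.
# Checks whether a measured sleep is within a tolerance that grows with the scale.
def within_scaled_delta?(expected_ms, actual_ms, delta_ms, scale)
  (expected_ms - actual_ms).abs <= delta_ms * scale
end

# Values from the reported failure: expected 200, measured 714, delta 180.
puts within_scaled_delta?(200, 714, 180, 1)   # unscaled: 514 is compared with 180
puts within_scaled_delta?(200, 714, 180, 10)  # scale 10: 514 is compared with 1800
```

With a scale of 1 the measured difference of 514 exceeds the tolerance of 180, which matches the reported failure; with a larger scale the same measurement falls inside the widened tolerance.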

Updated by jaruga (Jun Aruga) 4 months ago

Can we also apply the patch mentioned at https://bugs.ruby-lang.org/issues/16492#note-11 to the following tests, which use assert_in_out_err as well?

  • TestProcess#test_status_quit
  • TestRubyOptions#test_segv_test
  • TestRubyOptions#test_segv_loaded_features
  • TestRubyOptions#test_segv_setproctitle
  • TestSignal#test_ignored_interrupt
