Bug #15986
Closed: `TestJIT#test_block_handler_with_possible_frame_omitted_inlining` fails on s390x and armv7hl
Added by vo.x (Vit Ondruch) about 5 years ago. Updated about 5 years ago.
Description
I am trying to build the Ruby 2.7 snapshot for Fedora Rawhide, but I observe the following test failure on s390x and aarch64 platforms:
1) Failure:
TestJIT#test_block_handler_with_possible_frame_omitted_inlining [/builddir/build/BUILD/ruby-2.7.0-d9f8b88b47/test/ruby/test_jit.rb:846]:
Expected 2 times of JIT success, but succeeded 1 times.
script:
"""
def multiply(a, b)
  a *= b
end
3.times do
  p multiply(7.0, 10.0)
end
"""
stderr:
"""
JIT success (65.9ms): block in <main>@-e:6 -> /tmp/_ruby_mjit_p54157u0.c
gcc: fatal error: output filename may not be empty
compilation terminated.
Successful MJIT finish
"""
.
<2> expected but was
<1>.
Finished tests in 440.892116s, 47.2746 tests/s, 6150.4071 assertions/s.
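Editorial note: the "Expected 2 times of JIT success, but succeeded 1 times" message comes from counting success lines in the child process's stderr, as the `err.scan(/^#{JIT_SUCCESS_PREFIX}:/).size` line in the diff further down this thread shows. The following is a minimal sketch of that counting, not the test suite's actual helper. The value of `JIT_SUCCESS_PREFIX` and the name `jit_success_count` are assumptions made for illustration.

```ruby
# Assumption: a pattern resembling the one test_jit.rb uses; the real
# constant is not shown in this thread.
JIT_SUCCESS_PREFIX = 'JIT success \(\d+\.\d+ms\)'

# Hypothetical helper: counts lines that report a successful JIT compilation.
def jit_success_count(stderr)
  stderr.scan(/^#{JIT_SUCCESS_PREFIX}:/).size
end

# The stderr quoted in the failure above.
stderr = <<~LOG
  JIT success (65.9ms): block in <main>@-e:6 -> /tmp/_ruby_mjit_p54157u0.c
  gcc: fatal error: output filename may not be empty
  compilation terminated.
  Successful MJIT finish
LOG

puts jit_success_count(stderr)
```

Only the first line matches, so the count is 1: the second compilation (presumably `multiply`) never reported success because gcc aborted.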
Files
build-s390x.log (1.29 MB) | s390x build log | vo.x (Vit Ondruch), 07/10/2019 04:40 PM
build-armv7hl.log (1.26 MB) | armv7hl build log | vo.x (Vit Ondruch), 07/10/2019 05:01 PM
build-x86_64.log (1.28 MB) | x86_64 | vo.x (Vit Ondruch), 07/10/2019 05:01 PM
mjit_debug.diff (1.69 KB) | k0kubun (Takashi Kokubun), 07/12/2019 01:07 PM
build-x86_64.log (2.76 KB) | vo.x (Vit Ondruch), 07/12/2019 03:03 PM
build-s390x.log (3.03 KB) | vo.x (Vit Ondruch), 07/12/2019 03:03 PM
build-armv7hl.log (2.86 KB) | vo.x (Vit Ondruch), 07/12/2019 03:04 PM
mjit_debug2.diff (2.66 KB) | k0kubun (Takashi Kokubun), 07/12/2019 03:48 PM
build-armv7hl.log (4.48 KB) | vo.x (Vit Ondruch), 07/15/2019 01:14 PM
build-s390x.log (5.06 KB) | vo.x (Vit Ondruch), 07/15/2019 01:14 PM
build-x86_64.log (5.09 KB) | vo.x (Vit Ondruch), 07/15/2019 01:14 PM
Updated by k0kubun (Takashi Kokubun) about 5 years ago
- Status changed from Open to Assigned
- Assignee set to k0kubun (Takashi Kokubun)
Thanks for the report. I'd like to know more about the context in order to fix the issue.
- Does the error happen at the same place when you retry running the tests?
- If so, could you share the output of the following command and all .c/.h files referenced in it?
$ ruby --disable-gems --jit-verbose=2 --jit-save-temps --jit-wait --jit-min-calls=2 -e "
def multiply(a, b)
  a *= b
end
3.times do
  p multiply(7.0, 10.0)
end
"
In my case, they were /home/k0kubun/.rbenv/versions/ruby/include/ruby-2.7.0/x86_64-linux/rb_mjit_min_header-2.7.0.h, /tmp/_ruby_mjit_p17484u0.c, and /tmp/_ruby_mjit_p17484u1.c. The output of ls -la /tmp after that may also be helpful.
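Editorial note: to find "all .c/.h files referenced" in the verbose output, the paths can be pulled out of the captured log text. This is a minimal sketch under stated assumptions: the helper name `referenced_sources` and the regular expression are illustrative, and the sample log lines are the ones quoted elsewhere in this thread, not a complete `--jit-verbose=2` transcript.

```ruby
# Hypothetical helper: returns the unique absolute paths ending in .c or .h
# that appear anywhere in the given log text.
def referenced_sources(log)
  log.scan(%r{(?:/[^\s/:]+)+\.[ch]\b}).uniq
end

# Sample lines taken from output quoted in this thread.
log = <<~LOG
  JIT success (65.9ms): block in <main>@-e:6 -> /tmp/_ruby_mjit_p54157u0.c
  Cannot access header file: /usr/include/rb_mjit_min_header-2.7.0.h
LOG

referenced_sources(log).each { |path| puts path }
```

The resulting list could then be passed to `File.read` or copied into the build log, which is roughly what the patched test case later in the thread does by hand.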
Updated by vo.x (Vit Ondruch) about 5 years ago
I wish this were easier to debug. The problem is that this is a test failure and it happens on the build system, to which I don't have access. When I try to reproduce it on my system, this does not work:
$ echo "
def multiply(a, b)
  a *= b
end
3.times do
  p multiply(7.0, 10.0)
end
" > test.rb
$ make runruby TESTRUN_SCRIPT="--disable-gems --jit-verbose=2 --jit-save-temps --jit-wait --jit-min-calls=2 test.rb"
./revision.h unchanged
./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems --disable-gems --jit-verbose=2 --jit-save-temps --jit-wait --jit-min-calls=2 test.rb
MJIT: CC defaults to /usr/bin/gcc
MJIT: tmp_dir is /tmp
Cannot access header file: /usr/include/rb_mjit_min_header-2.7.0.h
Failure in MJIT header file name initialization
70.0
70.0
70.0
I will probably have to patch the test case so that it prints the contents of those files :/
Updated by k0kubun (Takashi Kokubun) about 5 years ago
- Status changed from Assigned to Feedback
I see. Thanks for the information. At this moment I cannot do anything either, so I'll wait for you to collect the information from the CI system somehow.
Updated by mame (Yusuke Endoh) about 5 years ago
FYI: RubyCI platforms include RHEL 7.1 s390x and Ubuntu armv8 (aarch64), and both are currently green. So I guess the cause is some factor other than the CPU architecture.
Updated by vo.x (Vit Ondruch) about 5 years ago
- File build-s390x.log build-s390x.log added
- File build-armv7hl.log build-armv7hl.log added
- File build-x86_64.log build-x86_64.log added
- Subject changed from `TestJIT#test_block_handler_with_possible_frame_omitted_inlining` fails on s390x and aarch64 to `TestJIT#test_block_handler_with_possible_frame_omitted_inlining` fails on s390x and armv7hl
So this is my hacked up test case:
$ git diff
diff --git a/test/ruby/test_jit.rb b/test/ruby/test_jit.rb
index 08494cbbbb..9ace7754d4 100644
--- a/test/ruby/test_jit.rb
+++ b/test/ruby/test_jit.rb
@@ -944,9 +944,15 @@ def assert_compile_once(script, result_inspect:, insns: [])
   end
   # Shorthand for normal test cases
-  def assert_eval_with_jit(script, stdout: nil, success_count:, min_calls: 1, insns: [], uplevel: 3)
-    out, err = eval_with_jit(script, verbose: 1, min_calls: min_calls)
+  def assert_eval_with_jit(script, stdout: nil, success_count:, min_calls: 2, insns: [], uplevel: 3)
+    out, err = eval_with_jit(script, verbose: 2, min_calls: min_calls, save_temps: true)
     actual = err.scan(/^#{JIT_SUCCESS_PREFIX}:/).size
+    puts "", "**********", "* rb_mjit_min_header-2.7.0.h", "---", ""
+    $stdout.flush
+    puts File.read(".ext/include/x86_64-linux/rb_mjit_min_header-2.7.0.h")
+    # puts File.read(".ext/include/armv7hl-linux/rb_mjit_min_header-2.7.0.h")
+    # puts File.read(".ext/include/s390x-linux/rb_mjit_min_header-2.7.0.h")
+    Dir.glob('/tmp/*.c').each {|f| puts '**********', "* #{f}", "", File.read(f), "---"; $stdout.flush}
     # Add --jit-verbose=2 logs for cl.exe because compiler's error message is suppressed
     # for cl.exe with --jit-verbose=1. See `start_process` in mjit_worker.c.
     if RUBY_PLATFORM.match?(/mswin/) && success_count != actual
And I ran just the single test:
make test-all TESTS="test/ruby/test_jit.rb -n /test_block_handler_with_possible_frame_omitted_inlining/"
See the attached logs from s390x, armv7hl and x86_64 (apologies, some of the lines might be slightly intermingled by the build system, but I hope you can handle that).
BTW, I was wrong to say that it fails on AArch64; it actually fails on armv7hl.
Updated by k0kubun (Takashi Kokubun) about 5 years ago
- File mjit_debug.diff mjit_debug.diff added
Thank you. All of this information helps me a lot.
It seems that the command line construction is broken for the second compilation in build-armv7hl.log and build-s390x.log, while build-x86_64.log seems okay. I have attached "mjit_debug.diff" to this ticket to collect more information from your build environments. Could you apply it and share the resulting build logs?
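Editorial note: the gcc message "output filename may not be empty" in the original report is consistent with a compiler argument list in which `-o` is followed by an empty string, which is one way a broken command line construction can show up. MJIT builds its command line in C (mjit_worker.c); the Ruby sketch below is only a hypothetical illustration of checking an argument list for that symptom, and `missing_output_name?` is an invented name.

```ruby
# Hypothetical helper: true when "-o" is the last argument or is followed
# by an empty string, i.e. the compiler would receive no output filename.
def missing_output_name?(argv)
  argv.each_with_index.any? do |arg, i|
    arg == "-o" && argv[i + 1].to_s.empty?
  end
end

good = ["gcc", "-shared", "-o", "/tmp/_ruby_mjit_p54157u1.so", "/tmp/_ruby_mjit_p54157u1.c"]
bad  = ["gcc", "-shared", "-o", "", "/tmp/_ruby_mjit_p54157u1.c"]

puts missing_output_name?(good)
puts missing_output_name?(bad)
```

The file names in `good` and `bad` are made up for the example; the real argument lists appear only in the attached build logs.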
Updated by vo.x (Vit Ondruch) about 5 years ago
- File build-x86_64.log build-x86_64.log added
- File build-s390x.log build-s390x.log added
- File build-armv7hl.log build-armv7hl.log added
Here are the logs (a bit messy again, but I hope you can get the information).
Updated by vo.x (Vit Ondruch) about 5 years ago
BTW a bit OT, but seeing all the information stored in the rb_mjit_min_header-2.7.0.h, I am not sure the JIT will work for binary distributions such as Fedora/RHEL. A lot of information about the machine used for the build appears to be embedded in it, while the JIT has to run on a quite different machine.
Updated by k0kubun (Takashi Kokubun) about 5 years ago
- File mjit_debug2.diff mjit_debug2.diff added
Thank you for the additional information. Could you also test the new "mjit_debug2.diff", which I have just attached, in the same way?
BTW a bit OT, but seeing all the information stored in the rb_mjit_min_header-2.7.0.h, I am not sure the JIT will work for binary distributions such as Fedora/RHEL.
MJIT's support policy is that the compiler used for runtime MJIT compilation, and its path, must be the same as the one used to build the Ruby binary. Otherwise it is simply unsupported. Even for a binary distribution, you could also distribute a compiler as needed.
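Editorial note: a packager could sanity-check part of this policy by testing whether the compiler recorded at build time can still be found on the target machine. This is a rough sketch under assumptions: it uses `RbConfig::CONFIG["CC"]` as a stand-in for the compiler MJIT was configured with (MJIT itself may record the compiler differently), and `find_compiler` is a hypothetical helper. It checks only that the executable exists, not that it is the same version.

```ruby
require "rbconfig"

# Hypothetical helper: returns the path of the compiler executable named by
# `cc` (any flags after the first word are ignored), or nil if not found.
def find_compiler(cc = RbConfig::CONFIG["CC"], path: ENV.fetch("PATH", ""))
  exe = cc.to_s.split.first
  return nil if exe.nil?

  # A name containing a slash is treated as a path and checked directly.
  if exe.include?("/")
    return File.file?(exe) && File.executable?(exe) ? exe : nil
  end

  # Otherwise search each PATH entry in order.
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, exe)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

found = find_compiler
puts(found ? "compiler found at #{found}" : "build-time compiler not found")
```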
Updated by k0kubun (Takashi Kokubun) about 5 years ago
- Status changed from Feedback to Closed
Applied in changeset git|d8cc41c43be65dd4b17e7a6e38f5a7fdf2b247d6.
Fix a wrong buffer size to avoid stack corruption
[Bug #15986]
Updated by k0kubun (Takashi Kokubun) about 5 years ago
Fortunately, a very similar issue was reproducible on my macOS machine. I did the mjit_debug2.diff investigation on my own and found the issue that is fixed by d8cc41c43be65dd4b17e7a6e38f5a7fdf2b247d6. The commit fixed the behavior on my machine, so I hope it is fixed in your environment too.
Updated by vo.x (Vit Ondruch) about 5 years ago
Here are the logs again. I am going to try the latest master and will report back if that helps.
Updated by vo.x (Vit Ondruch) about 5 years ago
- File build-armv7hl.log build-armv7hl.log added
- File build-s390x.log build-s390x.log added
- File build-x86_64.log build-x86_64.log added
Updated by vo.x (Vit Ondruch) about 5 years ago
I did several builds with 0c6c937904 and all passed. Thanks for the fix.