Bug #19254

closed

Enabling YJIT configuration option breaks rspec-core test suite

Added by vo.x (Vit Ondruch) over 1 year ago. Updated about 1 year ago.

Status:
Third Party's Issue
Assignee:
-
Target version:
-
ruby -v:
ruby 3.2.0dev (2022-12-23 master c5eefb7f37) [x86_64-linux]
[ruby-core:111400]

Description

In preparation for Ruby 3.2, we have enabled YJIT in Fedora:

https://src.fedoraproject.org/rpms/ruby/c/3c1be9f9c2c1d8679eebb9a185fefa15baa1bcfc?branch=private-ruby-3.2

Since then, the rspec-core test suite has been failing (see the attached log for full details):

... snip ...

  1) RSpec::Core::Example#run memory leaks, see GH-321, GH-1921 releases references to the examples / their ivars
     Failure/Error: expect(get_all.call).to eq opts.fetch(:post_gc)

       expected: []
            got: ["after_all", "before_all"]

       (compared using ==)
     # ./spec/rspec/core/example_spec.rb:469:in `expect_gc'
     # ./spec/rspec/core/example_spec.rb:492:in `block (4 levels) in <top (required)>'
     # ./spec/support/sandboxing.rb:16:in `block (3 levels) in <top (required)>'
     # ./spec/support/sandboxing.rb:7:in `block (2 levels) in <top (required)>'

Finished in 8.98 seconds (files took 0.47612 seconds to load)
2209 examples, 1 failure, 4 pending

Please note that YJIT was not enabled at runtime; only YJIT support was compiled in. Disabling YJIT support at build time makes the test case pass.
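To make the distinction between "built with YJIT support" and "YJIT enabled at runtime" concrete, here is a minimal sketch. The helper name `yjit_status` is hypothetical; it relies only on `RbConfig::CONFIG['YJIT_SUPPORT']` (the same key queried later in this thread) and `RubyVM::YJIT.enabled?`, and it assumes that any value other than an empty string or "no" means support was compiled in.

```ruby
require "rbconfig"

# Hypothetical helper: reports whether this interpreter was built with YJIT
# support and whether YJIT is actually active in the current process.
def yjit_status
  # Assumption: a missing key or "no" means YJIT support was not compiled in.
  support = RbConfig::CONFIG["YJIT_SUPPORT"].to_s
  built = !support.empty? && support != "no"

  # RubyVM::YJIT may be undefined on builds without YJIT, so guard the call.
  enabled = defined?(RubyVM::YJIT) &&
            RubyVM::YJIT.respond_to?(:enabled?) &&
            RubyVM::YJIT.enabled?

  { built_with_yjit: built, enabled_at_runtime: enabled ? true : false }
end

p yjit_status
```

In the situation described above, one would expect `built_with_yjit` to be true and `enabled_at_runtime` to be false.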


Files

builder-live.log.gz (28.7 KB) The full build log vo.x (Vit Ondruch), 12/23/2022 03:17 PM

Updated by mame (Yusuke Endoh) over 1 year ago

You mean this test?

https://github.com/rspec/rspec-core/blob/522b7727d02d9648c090b56fa68bbdc18a21c04d/spec/rspec/core/example_spec.rb#L444-L496

Frankly speaking, this test appears to be completely wrong. MRI's GC is not exact (in their terms, not reliable).
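A minimal sketch of the pattern the test relies on, and why it is fragile: the class `Probe` and both helper methods are hypothetical and not taken from rspec-core. The script allocates objects that become semantically unreachable, forces a GC, and counts the survivors. Because MRI marks the native stack conservatively, the count is not guaranteed to be zero, so no particular output should be expected.

```ruby
# Hypothetical marker class, used only so the instances are easy to count.
class Probe; end

# Allocate n objects that are unreachable once this method returns.
def allocate_probes(n)
  n.times { Probe.new }
  nil
end

# Force a full GC and count the Probe instances that are still on the heap.
def live_probe_count
  GC.start
  ObjectSpace.each_object(Probe).count
end

allocate_probes(10)
remaining = live_probe_count

# May print 0, but a conservative GC is allowed to keep some objects alive.
puts "Probe instances still alive after GC.start: #{remaining}"
```

An assertion such as `expect(remaining).to eq 0` encodes a guarantee the GC does not make, which is the objection raised here.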

Updated by k0kubun (Takashi Kokubun) over 1 year ago

  • Status changed from Open to Feedback

In addition to @mame (Yusuke Endoh)'s point, can you report how to reproduce the issue by building Ruby from source or from a tarball? e.g.

$ git clone --depth=1 https://github.com/ruby/ruby
$ cd ruby
$ ./autogen.sh
$ ./configure --enable-yjit --prefix="/opt/rubies/ruby" && make -j8 && make install

$ git clone --depth=1 https://github.com/rspec/rspec-core
$ cd rspec-core
$ unset GEM_ROOT GEM_HOME GEM_PATH
$ export PATH="/opt/rubies/ruby/bin:${PATH}"
$ bundle install
$ bundle exec rspec spec/rspec/core/example_spec.rb

With these steps, I could not reproduce the problem:

$ cd rspec-core
$ git rev-parse HEAD
522b7727d02d9648c090b56fa68bbdc18a21c04d
$ ruby -v -e "p RbConfig::CONFIG['YJIT_SUPPORT']"
ruby 3.2.0dev (2022-12-23T17:24:55Z master ee60756495) [x86_64-linux]
"yes"
$ RUBYOPT=-v bundle exec rspec spec/rspec/core/example_spec.rb:472
ruby 3.2.0dev (2022-12-23T17:24:55Z master ee60756495) [x86_64-linux]
Run options:
  include {:locations=>{"./spec/rspec/core/example_spec.rb"=>[472]}}
  exclude {:ruby=>#<Proc: ./spec/spec_helper.rb:110>}

Randomized with seed 37258

RSpec::Core::Example
  #run
    memory leaks, see GH-321, GH-1921
      releases references to the examples / their ivars

Finished in 0.0101 seconds (files took 0.09802 seconds to load)
1 example, 0 failures

Randomized with seed 37258

Updated by vo.x (Vit Ondruch) over 1 year ago

mame (Yusuke Endoh) wrote in #note-1:

You mean this test?

https://github.com/rspec/rspec-core/blob/522b7727d02d9648c090b56fa68bbdc18a21c04d/spec/rspec/core/example_spec.rb#L444-L496

Yes, sorry, forgot to attach the link.

k0kubun (Takashi Kokubun) wrote in #note-2:

In addition to @mame (Yusuke Endoh)'s point, can you report how to reproduce the issue by building Ruby from source or from a tarball?

The build was done via RPMs. Ruby was built from a tarball. Here is the full Ruby build log:

https://download.copr.fedorainfracloud.org/results/vondruch/ruby-3.2/fedora-rawhide-x86_64/05176885-ruby/builder-live.log.gz

Working on #19248, I suspect that some of the compiler options might help to reproduce this.

Updated by mtasaka (Mamoru TASAKA) about 1 year ago

It looks like adding %global _lto_cflags %{nil} to ruby.spec, i.e. removing -flto=auto -ffat-lto-objects from the compilation flags, makes the above rspec-core test pass (note that Fedora's ruby is built with gcc).

So maybe LTO is doing "something" with yjit.
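To check whether a given Ruby build was compiled with these flags, one option is to inspect the flags recorded in RbConfig. This is a minimal sketch: the helper name `lto_flags` is hypothetical, and it assumes the LTO flags passed by the packaging ended up in the `CFLAGS` or `LDFLAGS` entries of `RbConfig::CONFIG`, which may not hold for every build.

```ruby
require "rbconfig"

# Hypothetical helper: returns the LTO-related flags found in the recorded
# compiler and linker flags. Assumes they were stored in CFLAGS or LDFLAGS.
def lto_flags(config = RbConfig::CONFIG)
  %w[CFLAGS LDFLAGS]
    .flat_map { |key| config[key].to_s.split }
    .grep(/\A-f(?:lto|fat-lto-objects)/)
    .uniq
end

# Against the running interpreter:
p lto_flags

# Against a hand-written flag set resembling the one mentioned above:
p lto_flags("CFLAGS" => "-O2 -flto=auto -ffat-lto-objects")
```

An empty result for the running interpreter only means no such flags were recorded, not that LTO was definitely unused.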

Updated by vo.x (Vit Ondruch) about 1 year ago

k0kubun (Takashi Kokubun) wrote in #note-2:

$ RUBYOPT=-v bundle exec rspec spec/rspec/core/example_spec.rb:472

I did not hit the issue when running just this minimal example.

Updated by alanwu (Alan Wu) about 1 year ago

I agree with mame that the test is highly questionable.
The GC does not guarantee collection for all semantically unreachable objects since it's not exact.
Because we scan the native stack for conservative marking, changes in code generation could
spill different objects to the native stack and keep them alive. This is probably what we're seeing
through the combination of building YJIT + LTO, but not enabling YJIT at runtime.

We could take a heap dump (ObjectSpace.dump_all) and verify that indeed the objects are kept alive through the machine context, but beyond that, I don't think there is much to do here.

Updated by hsbt (Hiroshi SHIBATA) about 1 year ago

  • Status changed from Feedback to Third Party's Issue