Bug #19363

Fix rb_transient_heap_mark: wrong header (T_STRUCT) segfault

Added by bkuhlmann (Brooke Kuhlmann) over 2 years ago. Updated over 1 year ago.

Status: Assigned
Target version: -
ruby -v: ruby 3.2.0 (2022-12-25 revision a528908271) +YJIT [arm64-darwin22.2.0]
[ruby-core:111956]

Description

Overview

Hello. 👋 I'm hitting an issue where my build is constantly failing with a segfault. The following is a snippet taken from my local machine with YJIT enabled (see attachments for details):

/Users/bkuhlmann/.cache/frum/versions/3.2.0/lib/ruby/gems/3.2.0/gems/puma-6.0.2/lib/puma/runner.rb: [BUG] rb_transient_heap_mark: wrong header, T_STRUCT (0x0000000109ea98a0)
ruby 3.2.0 (2022-12-25 revision a528908271) +YJIT [arm64-darwin22.2.0]

The closest possibly related issue I could find (though I'm not sure it is the same problem) is #15358.

Steps to Recreate

You should be able to quickly recreate this issue via these steps:

  • Download/clone my Hemo project.
  • Run the setup steps.
  • Run the test suite by running bin/rspec.

If you need an example of the same segfault (but not on my macOS machine), you can see the same segfault via my Circle CI Build. My Circle CI build is using my Docker Alpine Linux Ruby image which might be of interest as well. This Docker image is also built with YJIT enabled.

Interestingly, if you run the test suite with bin/guard instead of bin/rspec, the segfault doesn't occur.

Environment

ruby 3.2.0 (2022-12-25 revision a528908271) +YJIT [arm64-darwin22.2.0]

1.43.0 (using Parser 3.2.0.0, rubocop-ast 1.24.1, running on ruby 3.2.0) [arm64-darwin22.2.0]
  - rubocop-performance 1.15.2
  - rubocop-rake 0.6.0
  - rubocop-rspec 2.18.1
  - rubocop-sequel 0.3.4
  - rubocop-thread_safety 0.4.4

Files

segfault.txt (237 KB) - Segfault - bkuhlmann (Brooke Kuhlmann), 01/21/2023 06:41 PM
ruby-2023-01-21-113841.ips (19.6 KB) - Translated Report - bkuhlmann (Brooke Kuhlmann), 01/21/2023 06:42 PM
segv.log (14.7 KB) - wanabe (_ wanabe), 03/26/2023 10:35 AM

Updated by alanwu (Alan Wu) over 2 years ago #1 [ruby-core:111958]

Thanks for the report, and for the comprehensive reproduction steps.
Triage note: it seems this issue can happen on x86_64-linux-musl without YJIT enabled at runtime.
The crash logs in the Circle CI build do not say +YJIT.

Updated by bkuhlmann (Brooke Kuhlmann) over 2 years ago #2 [ruby-core:112103]

I was able to narrow down where this bug is occurring. It turns out that when coverage for eval is enabled in SimpleCov, the segfault happens consistently. Here's the code in question, as found in the spec_helper.rb of the above application:

unless ENV["NO_COVERAGE"]
  SimpleCov.start do
    add_filter %r(^/spec/)
    enable_coverage :branch
    enable_coverage_for_eval  # <-- When this is enabled, the segmentation fault consistently occurs.
    minimum_coverage_by_file line: 95, branch: 95
  end
end

If the SimpleCov enable_coverage_for_eval statement is removed entirely, then there is no segmentation fault.

Updated by wanabe (_ wanabe) over 2 years ago #3 [ruby-core:113020]

I wrote a short reproduction script.
It depends on three things:

  • unexpected negative lineno for eval (or for class_eval)
  • Coverage.start with lines: true, eval: true
  • GC verification

require "coverage"

Coverage.start(lines: true, eval: true)
eval(<<~EOS, binding, "", -1)
  Kernel.module_eval do
    def bar(locals)
      bar = locals[:bar]
    end
  end
EOS
bar({})
GC.verify_compaction_references

I have also attached the SEGV log from ruby 3.3.0dev (2023-03-26T06:23:11Z master 2f916812a9) [x86_64-linux] on WSL2.

Updated by wanabe (_ wanabe) over 2 years ago #4 [ruby-core:113023]

  • Assignee set to mame (Yusuke Endoh)

I guess that update_line_coverage() does not expect negative line numbers.
https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/v3_2_1/entry/thread.c#L5512
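
To illustrate the guess above, here is a minimal Ruby model of a per-line hit counter that guards against line numbers outside the tracked range. It is only a sketch: record_line_hit and the counters array are hypothetical names, and this is not the C code in thread.c, whose exact bounds checks are not reproduced here.

```ruby
# Hypothetical model of a per-line hit counter; NOT the code in thread.c.
# `lines` holds one counter per source line (nil means "not tracked").
def record_line_hit(lines, lineno)
  index = lineno - 1
  # Guard both ends. In C, a negative index would touch memory outside the
  # array; in Ruby it would silently wrap around to the end of the array.
  return false if index < 0 || index >= lines.size
  return false if lines[index].nil?

  lines[index] += 1
  true
end

counters = [0, nil, 0]
record_line_hit(counters, 1)   # counted: first tracked line
record_line_hit(counters, -1)  # ignored: before the tracked range
record_line_hit(counters, 99)  # ignored: past the tracked range
```

The point of the model is only that a lower-bound check is needed once eval is allowed to start numbering lines below 1.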

(I think ideally eval should reject negative line numbers.
However, that would be a feature request rather than a bug fix, and it may cause compatibility issues.)

mame-san, can you have a look at the issue?
Please allow me to assign it to you.

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago #5 [ruby-core:113025]

wanabe (_ wanabe) wrote in #note-4:

(I think ideally eval should reject negative line numbers.
However, that would be a feature request rather than a bug fix, and it may cause compatibility issues.)

tilt (~445 million downloads, a dependency of Sinatra) uses eval with negative line numbers so that the line numbers reported in backtraces match the lines in the user-provided template file, even though tilt adds lines before those lines. There would be significant backwards compatibility issues if negative line numbers were removed.
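
The line-offset technique described above can be sketched roughly as follows. This is not tilt's actual code: the prelude, the template body, the page.erb file name, and reported_error_line are all made up for illustration. Only Kernel#eval's fourth argument (the line number assigned to the first line of the string) is real API.

```ruby
# Evaluates `source` as if its first line were numbered `first_lineno`,
# and returns the line number reported for the error it raises.
def reported_error_line(source, first_lineno)
  eval(source, binding, "page.erb", first_lineno)
rescue ArgumentError => e
  e.backtrace_locations.first.lineno
end

# Hypothetical two-line prelude that a template engine might prepend.
prelude = "title = 'demo'\ncount = 0\n"
# The user's template; the raise sits on line 2 of the user's file.
template_body = "count += 1\nraise ArgumentError, title\n"
source = prelude + template_body

# Numbering from 1 counts the prelude, so the error is reported too far down.
naive_line = reported_error_line(source, 1)

# Starting at a negative number cancels out the prelude lines.
offset_line = reported_error_line(source, 1 - prelude.lines.size)

puts "naive: #{naive_line}, with offset: #{offset_line}"
```

With a two-line prelude the starting line is -1, which is why rejecting negative values in eval would break this pattern.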

Updated by Eregon (Benoit Daloze) over 2 years ago #6 [ruby-core:113174]

IMO it would be good to deprecate negative line numbers; they are a hack that's pretty ugly to replicate in other Ruby implementations.
Tilt could just use prelude_code; original_first_line, no?

Updated by byroot (Jean Boussier) over 2 years ago #7 [ruby-core:113175]

@Eregon (Benoit Daloze) that doesn't work if your prelude contains magic comments, typically # frozen_string_literal: true
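
A small sketch of why the same-line approach breaks down with magic comments: a comment runs to the end of its line, so any code joined after it with a semicolon is swallowed. The page.erb file name and the variable names are only for illustration.

```ruby
magic = "# frozen_string_literal: true"

# Joined on one line: everything after "#" is comment text, so the
# expression is never evaluated and eval returns nil.
same_line = eval("#{magic}; ''.frozen?", binding, "page.erb", 1)

# On its own line the magic comment takes effect, but it pushes the
# template down by one line unless numbering starts below 1.
own_line = eval("#{magic}\n''.frozen?", binding, "page.erb", 0)

# With numbering starting at 0, the line after the comment is line 1.
body_line = eval("#{magic}\n__LINE__", binding, "page.erb", 0)

puts "same line: #{same_line.inspect}, own line: #{own_line.inspect}, body line: #{body_line}"
```

So a prelude that needs a magic comment costs at least one extra line, and the starting line number has to drop to 0 or lower to compensate.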

Updated by Eregon (Benoit Daloze) over 2 years ago #8 [ruby-core:113176]

Ah, yes. Maybe eval could accept such magic comments as keyword arguments, or something similar, to do it more cleanly.
eval already does a bit of that by using the String's encoding as the source encoding (if no encoding magic comment is used).

Updated by hsbt (Hiroshi SHIBATA) over 1 year ago #9

  • Status changed from Open to Assigned