Bug #21220: Memory corruption in update_line_coverage() [write at index -1] - Ruby - Ruby Issue Tracking System

Actions

Copy link

Bug #21220

closed

Memory corruption in update_line_coverage() [write at index -1]

Bug #21220: Memory corruption in update_line_coverage() [write at index -1]

Added by mbcodeandsound (Mike Bourgeous) 11 months ago. Updated 9 months ago.

Status:

Closed

Assignee:

Target version:

ruby -v:

ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [x86_64-linux]

Backport:

3.2: WONTFIX, 3.3: DONE, 3.4: DONE

[ruby-core:121556]

Description

Hello!

I have encountered repeatable memory corruption in Ruby 3.4.2 on Ubuntu 24.04.2 LTS, which I believe is happening in update_line_coverage(). I could not reproduce this on Ruby 3.x or earlier. My findings follow. I also have detailed step-by-step notes at https://github.com/mike-bourgeous/mb-sound/issues/36

Summary¶

update_line_coverage() calls rb_sourceline(), subtracts one from its return value, and uses this as an index into an Array. Sometimes rb_sourceline() returns 0, and when this happens, update_line_coverage() will write to index -1 of the array. This corrupts the heap before the Array, resulting in a program crash later during GC.

As I am new to the Ruby codebase I do not know if it's normal for rb_sourceline() to return 0 and update_line_coverage() should handle it, or if something is wrong in the code that ultimately feeds rb_sourceline().

Symptom¶

On Linux, affected processes print one of the following errors and exit:

munmap_chunk(): invalid pointer
Aborted (core dumped)

or, if preloading libc_malloc_debug.so

malloc_check_get_size: memory corruption
Aborted (core dumped)

Reproduction¶

I have a reduced GitHub project that can reproduce the bug consistently both on my machine and in CI. When I try to reduce the size of this repo further, the bug stops happening.

The issue only reproduces locally if the coverage/ directory has a large .resultset.json.

Repo: https://github.com/mike-bourgeous/reproduce-simplecov-ruby34-bug
Example of the bug: https://github.com/mike-bourgeous/reproduce-simplecov-ruby34-bug/actions/runs/14289657889/job/40049195631#step:5:176

# Repeatedly running the process increases the likelihood of crashing
# as the SimpleCov result file grows.
for f in `seq 1 100`; do echo $f; ruby -r./spec/simplecov_helper.rb bin/midi_roll.rb -c 40 -r 2 spec/test_data/all_notes.mid > /dev/null || break ; done

Research and reasoning¶

I initially found the crash during a live stream when I was upgrading a project from Ruby 2.7 to Ruby 3.4. The crash occurred when an RSpec test tried to spawn another Ruby process, while using SimpleCov to measure code coverage in both. I discovered a workaround of disabling SimpleCov in the nested process when running tests on Ruby 3.4. I used a somewhat unusual approach to get coverage metrics for subprocesses.

After the stream I wanted to understand what was really happening and see if I could find a way to re-enable test code coverage for subprocesses. I used a combination of Valgrind, GDB, and trial and error to narrow down the site of the crash and the original corruption. I wrote a GDB script to automate information gathering when the GC crash occurred, and Valgrind+vgdb to identify the original write that appeared to cause the corruption.

I reviewed the Git history of update_line_coverage(), rb_sourceline() (and the functions it calls), and a few other functions, but did not find any obvious changes between Ruby 3.3.x and Ruby 3.4.x, so the root cause is somewhere beyond my familiarity with the codebase.

Full details of my process are in my issue notes: https://github.com/mike-bourgeous/mb-sound/issues/36

Files

Download all files

corruption_c_stack.txt (2.63 KB) corruption_c_stack.txt		mbcodeandsound (Mike Bourgeous), 04/07/2025 05:26 PM
corruption_ruby_stack.txt (948 Bytes) corruption_ruby_stack.txt		mbcodeandsound (Mike Bourgeous), 04/07/2025 05:26 PM
crash_ruby_stack.txt (4.46 KB) crash_ruby_stack.txt		mbcodeandsound (Mike Bourgeous), 04/07/2025 05:26 PM
crash_c_stack.txt (26.2 KB) crash_c_stack.txt		mbcodeandsound (Mike Bourgeous), 04/07/2025 05:26 PM

Related issues 1 (0 open — 1 closed)

Updated by mbcodeandsound (Mike Bourgeous) 11 months ago Actions
Copy link
#1 [ruby-core:121557]

Something like this should prevent the memory corruption, but may be hiding a deeper issue:

--- thread.c    2025-02-14 14:25:54.000000000 -0700
+++ thread_fix.c        2025-04-07 11:32:53.571115993 -0600
@@ -5675,7 +5675,7 @@
                 rb_ary_push(lines, LONG2FIX(line + 1));
                 return;
             }
-            if (line >= RARRAY_LEN(lines)) { /* no longer tracked */
+            if (line < 0 || line >= RARRAY_LEN(lines)) { /* no longer tracked */
                 return;
             }
             num = RARRAY_AREF(lines, line);

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#2 [ruby-core:121573]

I do not know if it's normal for rb_sourceline() to return 0

Yes:


const char *
rb_sourcefile(void)
{
    const rb_execution_context_t *ec = GET_EC();
    const rb_control_frame_t *cfp = rb_vm_get_ruby_level_next_cfp(ec, ec->cfp);

    if (cfp) {
        return RSTRING_PTR(rb_iseq_path(cfp->iseq));
    }
    else {
        return 0;
    }
}

If there is no Ruby frame yet (only C frames) you get zero.

You patch seems OK, even though I think it would be simpler to return right after the rb_sourcefile() call.

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#3 [ruby-core:121577]

Thanks for the report. The issue reproduces on my machine, but it's very rare. I haven't figured out what happens.

byroot (Jean Boussier) wrote in #note-2:

If there is no Ruby frame yet (only C frames) you get zero.

That's true, but I was assuming that RUBY_EVENT_COVERAGE_LINE would never occur at such a timing.
I am OK to merge the proposed patch because it looks benign, but I am still curious how the issue itself happens.

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#4 [ruby-core:121578]

The issue reproduces on my machine, but it's very rare.

You can make it happen 100% of the time with the following patch:

diff --git a/thread.c b/thread.c
index 661ee977a9..247fe7237f 100644
--- a/thread.c
+++ b/thread.c
@@ -5662,6 +5662,9 @@ update_line_coverage(VALUE data, const rb_trace_arg_t *trace_arg)
         VALUE lines = RARRAY_AREF(coverage, COVERAGE_INDEX_LINES);
         if (lines) {
             long line = rb_sourceline() - 1;
+            if (line < 0) {
+                rb_bug("fail");
+            }
             long count;
             VALUE num;
             void rb_iseq_clear_event_flags(const rb_iseq_t *iseq, size_t pos, rb_event_flag_t reset);

It's rare because it only crashes if GC trigger which doesn't always happen.

I'm using that patch to try to come up with a reproduction (but you are welcome to dig further too, there's no guarantee I will find the repro).

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#5 [ruby-core:121579]

What I've figured for now it that the corruption is triggered by:

  # This require line makes sure the original script file is processed by simplecov
  require File.expand_path($0, '.')

    frame #4: 0x00000001003c468c ruby`rb_bug(fmt=<unavailable>) at error.c:1117:5
    frame #5: 0x00000001003c57dc ruby`update_line_coverage.cold.1 at thread.c:5666:17
    frame #6: 0x000000010021ae64 ruby`update_line_coverage(data=<unavailable>, trace_arg=<unavailable>) at thread.c:5666:17
    frame #7: 0x0000000100277fe0 ruby`exec_hooks_unprotected [inlined] exec_hooks_body(ec=0x0000000141f05bf0, list=0x0000000141f05970, trace_arg=0x000000016fdfc4c8) at vm_trace.c:352:17
    frame #8: 0x0000000100277fc0 ruby`exec_hooks_unprotected(ec=0x0000000141f05bf0, list=0x0000000141f05970, trace_arg=0x000000016fdfc4c8) at vm_trace.c:381:5
    frame #9: 0x0000000100277f30 ruby`rb_exec_event_hooks(trace_arg=<unavailable>, hooks=0x0000000141f05970, pop_p=0) at vm_trace.c:427:13
    frame #10: 0x000000010026c01c ruby`vm_trace_hook [inlined] rb_exec_event_hook_orig(ec=0x0000000141f05bf0, hooks=0x0000000141f05970, flag=65536, self=4302425520, id=0, called_id=0, klass=0, data=36, pop_p=0) at vm_core.h:2179:5
    frame #11: 0x000000010026bfe8 ruby`vm_trace_hook(ec=0x0000000141f05bf0, reg_cfp=0x0000000148127e78, pc=<unavailable>, pc_events=65537, target_event=65536, global_hooks=0x0000000141f05970, local_hooks_ptr=0x000000011b4d4ca8, val=36) at vm_insnhelper.c:7062:9
    frame #12: 0x000000010026b9e4 ruby`vm_trace(ec=0x0000000141f05bf0, reg_cfp=0x0000000148127e78) at vm_insnhelper.c:7170:13
    frame #13: 0x000000010024b128 ruby`vm_exec_core(ec=<unavailable>) at vm.inc:4972:5
    frame #14: 0x0000000100249860 ruby`rb_vm_exec(ec=0x0000000141f05bf0) at vm.c:2597:22
    frame #15: 0x000000010025e978 ruby`rb_iseq_eval(iseq=<unavailable>) at vm.c:2852:11 [artificial]
    frame #16: 0x0000000100123298 ruby`load_iseq_eval(ec=0x0000000141f05bf0, fname=4753022720) at load.c:789:5
    frame #17: 0x0000000100121350 ruby`require_internal(ec=0x0000000141f05bf0, fname=4757156480, exception=1, warn=<unavailable>) at load.c:1297:21
    frame #18: 0x0000000100120608 ruby`rb_require_string_internal(fname=4757156480, resurrect=false) at load.c:1403:22
    frame #19: 0x00000001001204d8 ruby`rb_f_require [inlined] rb_require_string(fname=4757156480) at load.c:1389:12
    frame #20: 0x00000001001204b8 ruby`rb_f_require(obj=<unavailable>, fname=<unavailable>) at load.c:1029:12
    frame #21: 0x0000000100268894 ruby`vm_call_cfunc_with_frame_(ec=0x0000000141f05bf0, reg_cfp=0x0000000148127ee8, calling=<unavailable>, argc=1, argv=0x00000001480280c0, stack_bottom=0x00000001480280b8) at vm_insnhelper.c:3794:11
    frame #22: 0x0000000100263af8 ruby`vm_call_alias(ec=<unavailable>, cfp=<unavailable>, calling=0x000000016fdfd140) at vm_insnhelper.c:4181:12
    frame #23: 0x000000010024cf4c ruby`vm_exec_core [inlined] vm_sendish(ec=0x0000000141f05bf0, reg_cfp=0x0000000148127ee8, cd=0x00006000022b3110, block_handler=0, method_explorer=mexp_search_method) at vm_insnhelper.c:5964:15
    frame #24: 0x000000010024ce48 ruby`vm_exec_core(ec=<unavailable>) at insns.def:898:11
    frame #25: 0x0000000100249860 ruby`rb_vm_exec(ec=0x0000000141f05bf0) at vm.c:2597:22
    frame #26: 0x000000010025e978 ruby`rb_iseq_eval(iseq=<unavailable>) at vm.c:2852:11 [artificial]
    frame #27: 0x0000000100123298 ruby`load_iseq_eval(ec=0x0000000141f05bf0, fname=4755544040) at load.c:789:5
    frame #28: 0x0000000100121350 ruby`require_internal(ec=0x0000000141f05bf0, fname=4726729080, exception=1, warn=<unavailable>) at load.c:1297:21

But I wasn't able to reduce it yet.

Updated by mame (Yusuke Endoh) 11 months ago · Edited Actions
Copy link
#6 [ruby-core:121580]

Status changed from Open to Assigned
Assignee set to prism

@byroot (Jean Boussier) Thanks! I think this is an issue of the prism compiler handling the line number around a flip-flop syntax.

test.rb

TracePoint.new(:line) do |tp|
  p tp
end.enable

load "./target.rb"

target.rb

if true..true
end

$ ~/work/ruby/local/bin/ruby --parser=prism test.rb
#<TracePoint:line test.rb:5>
#<TracePoint:line /tmp/reproduce-simplecov-ruby34-bug/target.rb:0> # <=== This lineno should be 1

$ ~/work/ruby/local/bin/ruby --parser=parse.y test.rb
#<TracePoint:line test.rb:5>
#<TracePoint:line /tmp/reproduce-simplecov-ruby34-bug/target.rb:1>

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#7 [ruby-core:121581]

Nice find!

Can source line ever legitimately be 0? Perhaps we are missing some assertions that would have caught this earlier? And perhaps there are other syntax causing the same issue.

Updated by mame (Yusuke Endoh) 11 months ago 1Actions
Copy link
#8 [ruby-core:121582]

@mbcodeandsound (Mike Bourgeous) Just FYI, I bet you meant to write !(1..16).cover?(channel) in the following line.

https://github.com/mike-bourgeous/reproduce-simplecov-ruby34-bug/blob/d73c3fe80014cb91d8b6c64847581feb8a19d1b6/bin/midi_roll.rb#L42

Thank goodness for us, because it resulted in the discovery of a bug in Ruby :-)

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#9 [ruby-core:121583]

byroot (Jean Boussier) wrote in #note-7:

Can source line ever legitimately be 0?

With eval, it can. It can be even negative.

$ ruby -e 'eval("p __LINE__", nil, "foo", -10000)'
-10000

And when I did this under Coverage.start(eval: true), it does [BUG] :-P

require 'coverage'

Coverage.start(eval: true, lines: true)

eval(<<END, nil, "foo", -1)
p
p
END

$ ~/work/ruby/local/bin/ruby test.rb
foo:-1: [BUG] fail
ruby 3.5.0dev (2025-04-06T03:24:23Z master e25889951f) +PRISM [x86_64-linux]

We can prevent RUBY_EVENT_COVERAGE_LINE from firing in contexts where lineno is negative or zero, but that doesn't happen very often.
So I guess I should merge the proposed patch as a precaution?

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#10 [ruby-core:121584]

So I guess I should merge the proposed patch as a precaution?

I think so yes, with that eval code as a test case.

Updated by mbcodeandsound (Mike Bourgeous) 11 months ago · Edited Actions
Copy link
#11 [ruby-core:121599]

mame (Yusuke Endoh) wrote in #note-8:

@mbcodeandsound (Mike Bourgeous) Just FYI, I bet you meant to write !(1..16).cover?(channel) in the following line.

https://github.com/mike-bourgeous/reproduce-simplecov-ruby34-bug/blob/d73c3fe80014cb91d8b6c64847581feb8a19d1b6/bin/midi_roll.rb#L42

Thank goodness for us, because it resulted in the discovery of a bug in Ruby :-)

Yes I did, thanks (0..15 as well rather than 1..16)! I didn't even know about the flip-flop operator, or its controversial almost removal, until today. I also did not expect to encounter a parser bug on top of a coverage-related bug. I'm glad my program's bug uncovered some bugs in Ruby either way, and it's also fun that the bug was on line 42.

Ruby continues to be my favorite language. Thanks for what you do!

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#12 [ruby-core:121604]

I created a PR.

https://github.com/ruby/ruby/pull/13089

The PR should prevent RUBY_EVENT_COVERAGE_LINE from firing on non-positive line numbers. (The fix looked easier than I thought.)

I will create another ticket for the bug of the prism compiler.

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#13

Related to Bug #21259: The Prism compiler wrongly creates a line number of zero added

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#14

Backport changed from 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN to 3.2: WONTFIX, 3.3: REQUIRED, 3.4: REQUIRED

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#15 [ruby-core:121606]

Assignee deleted (~~prism~~)

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#16

Status changed from Assigned to Closed

Applied in changeset git|0d6263bd416338a339651fb97fe4d62701704c4b.

Fix coverage measurement for negative line numbers

Fixes [Bug #21220]

Co-Authored-By: Mike Bourgeous mike@mikebourgeous.com
Co-Authored-By: Jean Boussier jean.boussier@gmail.com

Updated by k0kubun (Takashi Kokubun) 11 months ago Actions
Copy link
#17 [ruby-core:121652]

Backport changed from 3.2: WONTFIX, 3.3: REQUIRED, 3.4: REQUIRED to 3.2: WONTFIX, 3.3: REQUIRED, 3.4: DONE

ruby_3_4 1b6f9cd6ea5715ad33d8ba4c78382aa8aafdbcd4 merged revision(s) 0d6263bd416338a339651fb97fe4d62701704c4b.

Updated by nagachika (Tomoyuki Chikanaga) 9 months ago Actions
Copy link
#18 [ruby-core:122163]

Backport changed from 3.2: WONTFIX, 3.3: REQUIRED, 3.4: DONE to 3.2: WONTFIX, 3.3: DONE, 3.4: DONE

ruby_3_3 b1b6752fbeb2d23dbea639bd4b331c9e8b56f49c merged revision(s) 0d6263bd416338a339651fb97fe4d62701704c4b.

Actions

Copy link

Also available in: PDF Atom

Project

General

Profile

Ruby

Custom queries

Bug #21220

Memory corruption in update_line_coverage() [write at index -1]

Summary¶

Symptom¶

Reproduction¶

Research and reasoning¶

Updated by mbcodeandsound (Mike Bourgeous) 11 months ago Actions
Copy link
#1 [ruby-core:121557]

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#2 [ruby-core:121573]

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#3 [ruby-core:121577]

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#4 [ruby-core:121578]

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#5 [ruby-core:121579]

Updated by mame (Yusuke Endoh) 11 months ago · Edited Actions
Copy link
#6 [ruby-core:121580]

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#7 [ruby-core:121581]

Updated by mame (Yusuke Endoh) 11 months ago 1Actions
Copy link
#8 [ruby-core:121582]

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#9 [ruby-core:121583]

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#10 [ruby-core:121584]

Updated by mbcodeandsound (Mike Bourgeous) 11 months ago · Edited Actions
Copy link
#11 [ruby-core:121599]

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#12 [ruby-core:121604]

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#13

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#14

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#15 [ruby-core:121606]

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#16

Updated by k0kubun (Takashi Kokubun) 11 months ago Actions
Copy link
#17 [ruby-core:121652]

Updated by nagachika (Tomoyuki Chikanaga) 9 months ago Actions
Copy link
#18 [ruby-core:122163]

Project

General

Profile

Ruby

Custom queries

Bug #21220

Memory corruption in update_line_coverage() [write at index -1]

Summary¶

Symptom¶

Reproduction¶

Research and reasoning¶

Updated by mbcodeandsound (Mike Bourgeous) 11 months ago ActionsCopy link #1 [ruby-core:121557]

Updated by byroot (Jean Boussier) 11 months ago ActionsCopy link #2 [ruby-core:121573]

Updated by mame (Yusuke Endoh) 11 months ago ActionsCopy link #3 [ruby-core:121577]

Updated by byroot (Jean Boussier) 11 months ago ActionsCopy link #4 [ruby-core:121578]

Updated by byroot (Jean Boussier) 11 months ago ActionsCopy link #5 [ruby-core:121579]

Updated by mame (Yusuke Endoh) 11 months ago · Edited ActionsCopy link #6 [ruby-core:121580]

Updated by byroot (Jean Boussier) 11 months ago ActionsCopy link #7 [ruby-core:121581]

Updated by mame (Yusuke Endoh) 11 months ago 1ActionsCopy link #8 [ruby-core:121582]

Updated by mame (Yusuke Endoh) 11 months ago ActionsCopy link #9 [ruby-core:121583]

Updated by byroot (Jean Boussier) 11 months ago ActionsCopy link #10 [ruby-core:121584]

Updated by mbcodeandsound (Mike Bourgeous) 11 months ago · Edited ActionsCopy link #11 [ruby-core:121599]

Updated by mame (Yusuke Endoh) 11 months ago ActionsCopy link #12 [ruby-core:121604]

Updated by mame (Yusuke Endoh) 11 months ago ActionsCopy link #13

Updated by byroot (Jean Boussier) 11 months ago ActionsCopy link #14

Updated by byroot (Jean Boussier) 11 months ago ActionsCopy link #15 [ruby-core:121606]

Updated by mame (Yusuke Endoh) 11 months ago ActionsCopy link #16

Updated by k0kubun (Takashi Kokubun) 11 months ago ActionsCopy link #17 [ruby-core:121652]

Updated by nagachika (Tomoyuki Chikanaga) 9 months ago ActionsCopy link #18 [ruby-core:122163]

Updated by mbcodeandsound (Mike Bourgeous) 11 months ago Actions
Copy link
#1 [ruby-core:121557]

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#2 [ruby-core:121573]

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#3 [ruby-core:121577]

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#4 [ruby-core:121578]

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#5 [ruby-core:121579]

Updated by mame (Yusuke Endoh) 11 months ago · Edited Actions
Copy link
#6 [ruby-core:121580]

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#7 [ruby-core:121581]

Updated by mame (Yusuke Endoh) 11 months ago 1Actions
Copy link
#8 [ruby-core:121582]

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#9 [ruby-core:121583]

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#10 [ruby-core:121584]

Updated by mbcodeandsound (Mike Bourgeous) 11 months ago · Edited Actions
Copy link
#11 [ruby-core:121599]

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#12 [ruby-core:121604]

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#13

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#14

Updated by byroot (Jean Boussier) 11 months ago Actions
Copy link
#15 [ruby-core:121606]

Updated by mame (Yusuke Endoh) 11 months ago Actions
Copy link
#16

Updated by k0kubun (Takashi Kokubun) 11 months ago Actions
Copy link
#17 [ruby-core:121652]

Updated by nagachika (Tomoyuki Chikanaga) 9 months ago Actions
Copy link
#18 [ruby-core:122163]