Bug #21342

open

Segfault: invalid keeping_mutexes when using Mutex in Thread then Fiber after GC

Added by maciej.mensfeld (Maciej Mensfeld) 1 day ago. Updated 1 day ago.

Status: Open
Assignee: -
Target version: -
ruby -v: 3.4.3 (2025-04-14 revision d0b7e5b6a0) +PRISM [x86_64-linux]
[ruby-core:122121]

Description

Ruby crashes with a [BUG] invalid keeping_mutexes error when a mutex that was used in a Thread is later locked inside a Fiber and garbage collection runs. The error reports an attempt to unlock a mutex that is not locked, which suggests a state management issue with mutexes across Thread and Fiber boundaries.

Ruby Version

ruby 3.4.3 (2025-04-14 revision d0b7e5b6a0) +PRISM [x86_64-linux]

Reproduce Process

# segv.rb

5.times do
  m = Mutex.new
  Thread.new do
    m.synchronize do
    end
  end.join
  Fiber.new do
    GC.start
    m.lock
  end.resume
end

  1. Save the above code to a file (e.g., segv.rb)
  2. Run it with ruby segv.rb
  3. The crash occurs intermittently: sometimes it crashes immediately, sometimes it hangs, and once in a while it works
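
Because the failure is intermittent, it may help to run the script several times and tally how each run ends. The following is a minimal sketch of such a harness, not part of the original report: `classify_run` and `tally_runs` are hypothetical helper names, and the 10-second timeout is an assumed threshold for calling a run a hang.

```ruby
require "timeout"
require "rbconfig"

# Hypothetical helper: run a command once and classify how it ended.
# Returns :ok, :error (non-zero exit), :crash (killed by a signal,
# e.g. the abort raised after a [BUG] report), or :hang (timed out).
def classify_run(cmd, timeout_s: 10)
  pid = Process.spawn(*cmd, out: File::NULL, err: File::NULL)
  begin
    _, status = Timeout.timeout(timeout_s) { Process.wait2(pid) }
  rescue Timeout::Error
    # SIGKILL, because a hung interpreter may ignore gentler signals.
    Process.kill("KILL", pid)
    Process.wait(pid)
    return :hang
  end
  return :ok if status.success?
  status.signaled? ? :crash : :error
end

# Hypothetical helper: repeat the run and count each outcome.
def tally_runs(cmd, runs: 20, timeout_s: 10)
  Array.new(runs) { classify_run(cmd, timeout_s: timeout_s) }.tally
end
```

For example, `p tally_runs([RbConfig.ruby, "segv.rb"])` prints a hash of outcome counts for 20 runs of the script above.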

Actual Result

The program crashes with the following error:

segv.rb: [BUG] invalid keeping_mutexes: Attempt to unlock a mutex which is not locked
ruby 3.4.3 (2025-04-14 revision d0b7e5b6a0) +PRISM [x86_64-linux]

The whole crash report is in the attached txt file.

The full crash backtrace shows the error originates from:

  • rb_threadptr_unlock_all_locking_mutexes in thread.c:450
  • rb_thread_terminate_all in thread.c:467

The crash suggests an issue in mutex state management during thread termination.

Expected Result

The script should complete successfully without crashing. The mutex should be properly managed across Thread and Fiber contexts, and garbage collection should not interfere with mutex state.


Files

crash.txt (23.4 KB) maciej.mensfeld (Maciej Mensfeld), 05/15/2025 04:43 PM

Updated by masterleep2 (Bill Lipa) 1 day ago

Additionally, on macOS, the script crashes but then gets into a seemingly endless loop after printing the 'C level backtrace information' line, and can't be killed with ^C.

Updated by byroot (Jean Boussier) 1 day ago

Looks like we're missing a null check:

  * frame #0: 0x0000000100349cb4 miniruby`thread_mutex_remove(thread=0x0000000000000000, mutex=0x0000600001147f00) at thread_sync.c:225:12
    frame #1: 0x000000010033c1f0 miniruby`rb_mutex_unlock_th(mutex=0x0000600001147f00, th=0x0000000000000000, fiber=0x0000000122f0a9c0) at thread_sync.c:467:5
    frame #2: 0x0000000100349a9c miniruby`mutex_free(ptr=0x0000600001147f00) at thread_sync.c:132:27
    frame #3: 0x0000000100122bdc miniruby`rb_data_free(objspace=0x000000012380a200, obj=4349174720) at gc.c:1192:17
    frame #4: 0x0000000100122638 miniruby`rb_gc_obj_free(objspace=0x000000012380a200, obj=4349174720) at gc.c:1371:14

I suspect the thread was freed before the mutex.

Updated by byroot (Jean Boussier) 1 day ago

Looks like it's not that simple. This smells of memory corruption because we end up in this loop:

-> 230 	    while (*keeping_mutexes && *keeping_mutexes != mutex) {
   231 	        // Move to the next mutex in the list:
   232 	        keeping_mutexes = &(*keeping_mutexes)->next_mutex;
   233 	    }

And at some point ->next_mutex is a clearly wrong pointer (various low values such as 0xff, 0x13, etc). So I assume something else ends up overwriting that memory.

All I can say is that it still reproduces on master.
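
For readers less familiar with the pattern, the loop quoted above walks a singly linked list of held mutexes until it finds the one to remove. The following Ruby model is only an illustration of that walk, not the actual C implementation: `MutexNode` and `remove_mutex` are hypothetical names, and leaving the list unchanged when the target is absent is a choice made for this sketch.

```ruby
# Illustrative model only: these names are hypothetical, not Ruby internals.
MutexNode = Struct.new(:name, :next_mutex)

# Unlink `target` from the list starting at `head` and return the new head.
def remove_mutex(head, target)
  prev = nil
  node = head
  # Walk until we reach the target or fall off the end of the list.
  # Every step trusts next_mutex; in C, a corrupted link here means
  # following a garbage pointer.
  while node && !node.equal?(target)
    prev = node
    node = node.next_mutex
  end
  return head if node.nil? # target not found: leave the list as it is

  following = node.next_mutex
  node.next_mutex = nil
  if prev
    prev.next_mutex = following
    head
  else
    following # the target was the head, so its successor becomes the head
  end
end
```

Since every iteration dereferences the previous node's link, a single overwritten `next_mutex` field is enough to send the real loop into invalid memory, which fits the low pointer values seen in the debugger.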
