Bug #21342
Status: Open
Segfault: invalid keeping_mutexes when using Mutex in Thread then Fiber after GC
Description
Ruby crashes with a [BUG] invalid keeping_mutexes error when a Mutex is first used in a Thread and then locked inside a Fiber after a garbage collection run. The error reports an attempt to unlock a mutex that is not locked, which suggests a state management issue with mutexes across Thread and Fiber boundaries.
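For context on the Thread/Fiber boundary mentioned above, here is a minimal sketch of how mutex ownership looks from two fibers on the same thread. It assumes Ruby 3.0 or later, where a Mutex is held per Fiber rather than per Thread; the variable names are only illustrative and are not part of the reproduction script.

```ruby
m = Mutex.new
m.lock # locked by the main fiber of the main thread

# Ask a second fiber on the same thread how it sees the mutex.
seen_in_fiber = Fiber.new { [m.locked?, m.owned?] }.resume
seen_outside  = [m.locked?, m.owned?]
m.unlock

p seen_in_fiber # => [true, false] (locked, but not owned by this fiber)
p seen_outside  # => [true, true]
```

This only illustrates that the owner of a locked mutex is a specific fiber, so cleanup code has to track both the thread and the fiber. It does not trigger the crash by itself.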
Ruby Version
ruby 3.4.3 (2025-04-14 revision d0b7e5b6a0) +PRISM [x86_64-linux]
Reproduce Process
# segv.rb
5.times do
  m = Mutex.new
  Thread.new do
    m.synchronize do
    end
  end.join
  Fiber.new do
    GC.start
    m.lock
  end.resume
end
- Save the above code to a file (e.g., segv.rb)
- Run with ruby segv.rb
- The crash occurs intermittently: sometimes it crashes immediately, sometimes it hangs, and once in a while it works
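Because the outcome varies from run to run, it can help to run the script several times and count the results. The following is a rough sketch of such a harness, not part of the original report: the helper names run_once and tally, the run count, and the timeout are all assumptions chosen for illustration. A hung child is stopped with SIGKILL after the timeout.

```ruby
require "rbconfig"
require "timeout"

# Run one command and classify the result as :ok, :crash, or :hang.
# (Hypothetical helper; a non-zero exit or death by signal counts as :crash.)
def run_once(cmd, timeout_sec)
  pid = Process.spawn(*cmd, out: File::NULL, err: File::NULL)
  begin
    status = Timeout.timeout(timeout_sec) { Process.wait2(pid).last }
    status.success? ? :ok : :crash
  rescue Timeout::Error
    Process.kill("KILL", pid) # SIGKILL cannot be trapped by the child
    Process.wait(pid)
    :hang
  end
end

# Repeat the command and count how often each outcome occurs.
def tally(cmd, runs: 20, timeout_sec: 10)
  Array.new(runs) { run_once(cmd, timeout_sec) }.tally
end

# Example usage, assuming segv.rb is in the current directory:
#   p tally([RbConfig.ruby, "segv.rb"])
```

The result is a hash from outcome to count, which makes it easier to compare Ruby builds or platforms than a single run would.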
Actual Result
The program crashes with the following error:
segv.rb: [BUG] invalid keeping_mutexes: Attempt to unlock a mutex which is not locked
ruby 3.4.3 (2025-04-14 revision d0b7e5b6a0) +PRISM [x86_64-linux]
The whole crash report is in the attached txt file.
The full crash backtrace shows the error originates from:
- rb_threadptr_unlock_all_locking_mutexes in thread.c:450
- rb_thread_terminate_all in thread.c:467
The crash suggests an issue in mutex state management during thread termination.
Expected Result
The script should complete successfully without crashing. The mutex should be properly managed across Thread and Fiber contexts, and garbage collection should not interfere with mutex state.
Files
Updated by masterleep2 (Bill Lipa) 1 day ago
Additionally, on macOS, the script crashes but then gets stuck in a seemingly endless loop after printing the 'C level backtrace information' line, and can't be killed with ^C.
Updated by byroot (Jean Boussier) 1 day ago
Looks like we're missing a null check:
* frame #0: 0x0000000100349cb4 miniruby`thread_mutex_remove(thread=0x0000000000000000, mutex=0x0000600001147f00) at thread_sync.c:225:12
frame #1: 0x000000010033c1f0 miniruby`rb_mutex_unlock_th(mutex=0x0000600001147f00, th=0x0000000000000000, fiber=0x0000000122f0a9c0) at thread_sync.c:467:5
frame #2: 0x0000000100349a9c miniruby`mutex_free(ptr=0x0000600001147f00) at thread_sync.c:132:27
frame #3: 0x0000000100122bdc miniruby`rb_data_free(objspace=0x000000012380a200, obj=4349174720) at gc.c:1192:17
frame #4: 0x0000000100122638 miniruby`rb_gc_obj_free(objspace=0x000000012380a200, obj=4349174720) at gc.c:1371:14
I suspect the thread was freed before the mutex.
Updated by byroot (Jean Boussier) 1 day ago
Looks like it's not that simple. This smells of memory corruption because we end up in this loop:
-> 230 while (*keeping_mutexes && *keeping_mutexes != mutex) {
231 // Move to the next mutex in the list:
232 keeping_mutexes = &(*keeping_mutexes)->next_mutex;
233 }
And at some point ->next_mutex is a clearly wrong pointer (various low values such as 0xff, 0x13, etc). So I assume something else ends up overwriting that memory.
All I can say is it still reproduces on master.
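To make the loop above easier to follow, here is a toy model in Ruby of removing an entry from a singly linked list of held mutexes. ToyMutex and ToyThread are hypothetical names invented for this sketch; it is not the code in thread_sync.c, and it only mirrors the general shape of walking next_mutex links until the target is found.

```ruby
# Toy model only: these classes are hypothetical, not Ruby's internals.
ToyMutex = Struct.new(:name, :next_mutex)

class ToyThread
  def initialize
    @keeping_mutexes = nil # head of the list of held mutexes
  end

  # Push a mutex onto the front of the list.
  def keep(mutex)
    mutex.next_mutex = @keeping_mutexes
    @keeping_mutexes = mutex
  end

  # Walk the list until the mutex is found, then unlink it.
  # Returns false when the mutex is not in the list.
  def remove(mutex)
    prev = nil
    cur = @keeping_mutexes
    while cur && !cur.equal?(mutex)
      prev = cur
      cur = cur.next_mutex # the walk trusts every link it follows
    end
    return false unless cur

    if prev
      prev.next_mutex = cur.next_mutex
    else
      @keeping_mutexes = cur.next_mutex
    end
    cur.next_mutex = nil
    true
  end

  # Names of the held mutexes, from the head of the list onward.
  def names
    out = []
    cur = @keeping_mutexes
    while cur
      out << cur.name
      cur = cur.next_mutex
    end
    out
  end
end

t = ToyThread.new
a, b, c = %w[a b c].map { |n| ToyMutex.new(n) }
[a, b, c].each { |mx| t.keep(mx) }
t.remove(b)
p t.names # => ["c", "a"]
```

In this model every link is either another node or nil, so the walk always terminates. In C, a node whose memory has been freed or overwritten leaves a garbage next_mutex value in the chain, which is consistent with the low pointer values observed above.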