Bug #17827
closedMonitor is not fiber safe
Description
According to our discussion here https://github.com/rspec/rspec-support/issues/501 it seems like typical implementation of per-thread reentrant mutex is no longer valid and can lead to some deadlock situation.
#!/usr/bin/env ruby
require 'monitor'
def monitor_failure
m = Monitor.new
f1 = Fiber.new do
m.synchronize do
puts "f1 A"
Fiber.yield
puts "f1 B"
end
end
f2 = Fiber.new do
m.synchronize do
puts "f2 A"
# Fiber.yield
f1.resume
puts "f2 B"
end
end
f1.resume
f2.resume
end
monitor_failure
Updated by Eregon (Benoit Daloze) over 3 years ago
As discussed on that linked issue (https://github.com/rspec/rspec-support/issues/501#issuecomment-826315195),
the behavior should be the same as Mutex
in this case, so it is a bug of the monitor stdlib.
Updated by Eregon (Benoit Daloze) over 3 years ago
- Backport changed from 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN to 2.6: DONTNEED, 2.7: DONTNEED, 3.0: REQUIRED
Updated by Eregon (Benoit Daloze) over 3 years ago
- Assignee set to Eregon (Benoit Daloze)
Updated by Eregon (Benoit Daloze) over 3 years ago
- Status changed from Open to Closed
Applied in changeset git|3a3b19b2bba49e5d6f1cf13764eb6dd701397be9.
Fix Monitor to lock per Fiber, like Mutex [Bug #17827]
Updated by Eregon (Benoit Daloze) over 3 years ago
Fixed in https://github.com/ruby/ruby/commit/3a3b19b2bba49e5d6f1cf13764eb6dd701397be9.
This should be backported to 3.0 (I marked it as such).
I noticed something else while writing a test for this:
On 2.7:
ruby -e 'm=Mutex.new; m.lock; p 1; Fiber.new { m.lock }.resume'
1
Traceback (most recent call last):
1: from -e:1:in `block in <main>'
-e:1:in `lock': deadlock; recursive locking (ThreadError)
On 3.0:
$ ruby -e 'm=Mutex.new; m.lock; p 1; Fiber.new { m.lock }.resume'
1
-e:1:in `lock': No live threads left. Deadlock? (fatal)
1 threads, 1 sleeps current:0x000000000214f080 main thread:0x000000000214f080
* #<Thread:0x00000000021bb488 sleep_forever>
rb_thread_t:0x000000000214f080 native:0x00007f05edc32b80 int:0
from -e:1:in `block in <main>'
$ ruby -e 'Thread.new {sleep}; sleep 0.1; m=Mutex.new; m.lock; p 1; Fiber.new { m.lock }.resume'
1
=> hangs
Same behavior on master for Monitor.new
with my fix (ruby -rmonitor -e 'm=Monitor.new; m.enter; p 1; Fiber.new { p m.enter }.resume'
=> fatal, or hang if there is another Thread).
It would be nice to raise a ThreadError when detecting in Mutex#lock that the Mutex is owned by the same Thread but a different Fiber, if there is no Fiber scheduler.
That's more of a nice-to-have though.
Updated by ioquatix (Samuel Williams) over 3 years ago
Thanks for your work. @nagachika (Tomoyuki Chikanaga) can you please backport this? 🙏
Updated by nagachika (Tomoyuki Chikanaga) over 3 years ago
- Backport changed from 2.6: DONTNEED, 2.7: DONTNEED, 3.0: REQUIRED to 2.6: DONTNEED, 2.7: DONTNEED, 3.0: DONE
ruby_3_0 f9196de1dee6f5ab8b6fe115070b92775a3500fe merged revision(s) 3a3b19b2bba49e5d6f1cf13764eb6dd701397be9.
Updated by ioquatix (Samuel Williams) over 3 years ago
@nagachika (Tomoyuki Chikanaga) thank you so much for your effort.
Updated by byroot (Jean Boussier) about 2 years ago
It would be nice to raise a ThreadError when detecting in Mutex#lock that the Mutex is owned by the same Thread but a different Fiber, if there is no Fiber scheduler.
I just ran into this issue, that would indeed be good to error out rather that deadlock forever.
Updated by Eregon (Benoit Daloze) about 2 years ago
Patch welcome :)
Updated by byroot (Jean Boussier) about 2 years ago
Yeah of course.
I'm trying to look at it, but not 100% sure what the condition should be.
Something like:
monitor.owner.thread == current_fiber.thread && !fiber_scheduler
In other word, if we're trying to enter a monitor currently owned by another fiber belonging to the same thread than us, and there is no fiber scheduler registered.
If so the issue will be to get the thread_id or ptr of a fiber, because monitor
is an extension and I don't see anything in the public API that allows to tell that.
One way I guess is I can store rb_thread_current()
in the monitor struct, it's a bit redundant but would work I think.
Updated by byroot (Jean Boussier) about 2 years ago
Nevermind, I just realized that monitor uses a Mutex internally, so it's much easier to add this logic to mutexes in thread_sync.c
Updated by byroot (Jean Boussier) about 2 years ago
- Related to Bug #19105: mutex: Raise a ThreadError when detecting a fiber deadlock added