Project

General

Profile

Actions

Bug #17827

closed

Monitor is not fiber safe

Added by ioquatix (Samuel Williams) over 3 years ago. Updated almost 2 years ago.

Status:
Closed
Target version:
-
[ruby-core:103591]

Description

According to our discussion here https://github.com/rspec/rspec-support/issues/501 it seems like typical implementation of per-thread reentrant mutex is no longer valid and can lead to some deadlock situation.

#!/usr/bin/env ruby

require 'monitor'

def monitor_failure
	m = Monitor.new
	
	f1 = Fiber.new do
		m.synchronize do
			puts "f1 A"
			Fiber.yield
			puts "f1 B"
		end
	end
	
	f2 = Fiber.new do
		m.synchronize do
			puts "f2 A"
			# Fiber.yield
			f1.resume
			puts "f2 B"
		end
	end
	
	f1.resume
	f2.resume
end

monitor_failure

Related issues 1 (0 open1 closed)

Related to Ruby master - Bug #19105: mutex: Raise a ThreadError when detecting a fiber deadlockClosedActions

Updated by Eregon (Benoit Daloze) over 3 years ago

As discussed on that linked issue (https://github.com/rspec/rspec-support/issues/501#issuecomment-826315195),
the behavior should be the same as Mutex in this case, so it is a bug of the monitor stdlib.

Actions #2

Updated by Eregon (Benoit Daloze) over 3 years ago

  • Description updated (diff)
Actions #3

Updated by Eregon (Benoit Daloze) over 3 years ago

  • Backport changed from 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN to 2.6: DONTNEED, 2.7: DONTNEED, 3.0: REQUIRED

Updated by Eregon (Benoit Daloze) over 3 years ago

  • Assignee set to Eregon (Benoit Daloze)
Actions #5

Updated by Eregon (Benoit Daloze) over 3 years ago

  • Status changed from Open to Closed

Applied in changeset git|3a3b19b2bba49e5d6f1cf13764eb6dd701397be9.


Fix Monitor to lock per Fiber, like Mutex [Bug #17827]

Updated by Eregon (Benoit Daloze) over 3 years ago

Fixed in https://github.com/ruby/ruby/commit/3a3b19b2bba49e5d6f1cf13764eb6dd701397be9.
This should be backported to 3.0 (I marked it as such).

I noticed something else while writing a test for this:

On 2.7:

ruby -e 'm=Mutex.new; m.lock; p 1; Fiber.new { m.lock }.resume'
1
Traceback (most recent call last):
	1: from -e:1:in `block in <main>'
-e:1:in `lock': deadlock; recursive locking (ThreadError)

On 3.0:

$ ruby -e 'm=Mutex.new; m.lock; p 1; Fiber.new { m.lock }.resume'           
1
-e:1:in `lock': No live threads left. Deadlock? (fatal)
1 threads, 1 sleeps current:0x000000000214f080 main thread:0x000000000214f080
* #<Thread:0x00000000021bb488 sleep_forever>
   rb_thread_t:0x000000000214f080 native:0x00007f05edc32b80 int:0   
	from -e:1:in `block in <main>'

$ ruby -e 'Thread.new {sleep}; sleep 0.1; m=Mutex.new; m.lock; p 1; Fiber.new { m.lock }.resume'
1
=> hangs

Same behavior on master for Monitor.new with my fix (ruby -rmonitor -e 'm=Monitor.new; m.enter; p 1; Fiber.new { p m.enter }.resume' => fatal, or hang if there is another Thread).

It would be nice to raise a ThreadError when detecting in Mutex#lock that the Mutex is owned by the same Thread but a different Fiber, if there is no Fiber scheduler.
That's more of a nice-to-have though.

Updated by ioquatix (Samuel Williams) over 3 years ago

Thanks for your work. @nagachika (Tomoyuki Chikanaga) can you please backport this? 🙏

Updated by nagachika (Tomoyuki Chikanaga) over 3 years ago

  • Backport changed from 2.6: DONTNEED, 2.7: DONTNEED, 3.0: REQUIRED to 2.6: DONTNEED, 2.7: DONTNEED, 3.0: DONE

ruby_3_0 f9196de1dee6f5ab8b6fe115070b92775a3500fe merged revision(s) 3a3b19b2bba49e5d6f1cf13764eb6dd701397be9.

Updated by byroot (Jean Boussier) almost 2 years ago

It would be nice to raise a ThreadError when detecting in Mutex#lock that the Mutex is owned by the same Thread but a different Fiber, if there is no Fiber scheduler.

I just ran into this issue, that would indeed be good to error out rather that deadlock forever.

Updated by Eregon (Benoit Daloze) almost 2 years ago

Patch welcome :)

Updated by byroot (Jean Boussier) almost 2 years ago

Yeah of course.

I'm trying to look at it, but not 100% sure what the condition should be.

Something like:

monitor.owner.thread == current_fiber.thread && !fiber_scheduler

In other word, if we're trying to enter a monitor currently owned by another fiber belonging to the same thread than us, and there is no fiber scheduler registered.

If so the issue will be to get the thread_id or ptr of a fiber, because monitor is an extension and I don't see anything in the public API that allows to tell that.

One way I guess is I can store rb_thread_current() in the monitor struct, it's a bit redundant but would work I think.

Updated by byroot (Jean Boussier) almost 2 years ago

Nevermind, I just realized that monitor uses a Mutex internally, so it's much easier to add this logic to mutexes in thread_sync.c

Actions #14

Updated by byroot (Jean Boussier) almost 2 years ago

  • Related to Bug #19105: mutex: Raise a ThreadError when detecting a fiber deadlock added
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0