Feature #21309

closed

Can Thread::Mutex be Ractor shareable?

Added by osyoyu (Daisuke Aritomo) 6 days ago. Updated 2 days ago.

Status: Rejected
Assignee: -
Target version: -
[ruby-core:121830]

Description

Background

Keeping a Mutex object in a constant or a class instance variable is a common pattern seen in code with thread safety in mind. However, this kind of code does not play well with Ractors:

require 'thread'

class C
  MUTEX = Mutex.new

  def self.foo
    MUTEX.synchronize { p 1 }
  end
end

Ractor.new {
  C.foo
}.take

t.rb:11: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.
#<Thread:0x000000011d80f368 run> terminated with exception (report_on_exception is true):
t.rb:7:in 'C.foo': can not access non-shareable objects in constant C::MUTEX by non-main ractor. (Ractor::IsolationError)
        from t.rb:12:in 'block in <main>'
<internal:ractor>:711:in 'Ractor#take': thrown by remote Ractor. (Ractor::RemoteError)
        from t.rb:13:in '<main>'
t.rb:7:in 'C.foo': can not access non-shareable objects in constant C::MUTEX by non-main ractor. (Ractor::IsolationError)
        from t.rb:12:in 'block in <main>'

Many libraries follow this pattern. Mutex not being Ractor shareable is blocking these libraries from being used from inside Ractors.
Timeout in stdlib in particular has a large impact, since it is required by default from many other libraries, including net/http.

https://github.com/ruby/timeout/blob/v0.4.3/lib/timeout.rb#L49-L50
https://github.com/lostisland/faraday/blob/v2.13.1/lib/faraday/middleware.rb#L13

Proposal

Make built-in concurrency primitives (Thread::Mutex, Thread::ConditionVariable and Thread::Queue) Ractor shareable.

While this idea may not be strictly aligned with the idea of the Ractor world (exchanging messages for controlling concurrency?), I have the feeling that too much code is blocked from running in Ractors because Mutex is not Ractor shareable.
Allowing Mutexes to be shared would make a large portion of existing Ruby code Ractor-compatible, or at least make migration much easier.
I believe that it won't be semantically incorrect, since they are concurrency primitives after all.

One thing to consider is that the current Mutex implementation is based on the GVL (I believe so). Migration to some other implementation, e.g. pthread_mutex or CRITICAL_SECTION, may be needed to make Mutex work well on Ractors.

Updated by Eregon (Benoit Daloze) 6 days ago

At least for Queue it's not that simple, because it contains objects, and the invariants of Ractor (without which it would just segfault) are:

  1. No object can be accessed by multiple Ractors, unless it is a shareable object.
  2. Shareable objects can only refer to other shareable objects (otherwise it trivially breaks 1.).

Shareable objects are also typically immutable, or avoid exposing mutability to other Ractors; otherwise they reintroduce race conditions and lose the benefits of the actor model.
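
As a rough illustration of the two invariants, here is a minimal sketch using Ractor.shareable?, which reports whether an object may be referenced from more than one Ractor. The sample values are arbitrary and are not taken from Timeout or any other library.

```ruby
# Immutable values are shareable.
int_ok    = Ractor.shareable?(42)
frozen_ok = Ractor.shareable?("config".freeze)

# A mutable String is not shareable.
mutable_ok = Ractor.shareable?(String.new("config"))

# Invariant 2: a frozen container is shareable only if everything it
# refers to is shareable as well.
shallow_ok = Ractor.shareable?([String.new("a")].freeze)
deep_ok    = Ractor.shareable?(["a".freeze].freeze)

# The primitives discussed in this ticket are currently not shareable.
mutex_ok = Ractor.shareable?(Mutex.new)
queue_ok = Ractor.shareable?(Queue.new)

p [int_ok, frozen_ok, mutable_ok, shallow_ok, deep_ok, mutex_ok, queue_ok]
```

Only the integer, the frozen String, and the deeply frozen Array should report true.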

So you could have a Queue which only accepts shareable objects, but that wouldn't work for Timeout because the Request objects are mutable.
Or you could have a Queue which Ractor-moves objects, but that doesn't work for Timeout either because it keeps a reference to the enqueued object and uses it.
Also, both of these alternatives would be incompatible with the existing non-Ractor semantics if applied to the core ::Queue itself.

I think @ko1 (Koichi Sasada) has a plan for Timeout specifically, using Ractor-local storage. I'm waiting for his PR, and I hope it won't make the code too complicated/messy.

My impression is yes it's (sometimes very) hard to make existing Ruby code Ractor-compatible, and it's due to the Ractor/actor programming model.
IMO Rubyists should just use threads, and if they want them to run in parallel, use TruffleRuby or JRuby, or push harder for CRuby to remove the GVL (CPython has done it, so it is feasible).

Updated by osyoyu (Daisuke Aritomo) 5 days ago

At least for Queue it's not that simple, because it contains objects

Indeed, you're right about Queue. I overlooked that. Since Queue isn't really a concurrency primitive, I think it'd be fine for it to remain Ractor-unshareable.

On the other hand, I still believe that Mutex and ConditionVariable should be Ractor-shareable. Of course it'd be a great boost if Timeout becomes Ractor compatible, but in my view there's no harm in making Mutex shareable as well.

My impression is yes it's (sometimes very) hard to make existing Ruby code Ractor-compatible, and it's due to the Ractor/actor programming model.

I agree that there are fundamental constraints imposed by the actor model. At the same time, I think it's also true that existing code often can't run inside a Ractor simply because the necessary building blocks haven't been made Ractor compatible yet.

Updated by byroot (Jean Boussier) 5 days ago

I'm not sure I see the use case for Mutex to be shareable, at least in this specific scenario.

To take the Timeout.timeout example, making the mutex accessible from other ractors wouldn't solve the problem, because ultimately you need one timer thread and one event queue per Ractor.
So clearly the issue here is that some process-global state should be refactored to be Ractor-local.
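
To sketch what "Ractor-local instead of process-global" could look like: the example below keeps a mutex and a pending list in Ractor-local storage via Ractor.current[...]. TimerRegistry, STORAGE_KEY, State and register are hypothetical names made up for illustration; this is not how the timeout gem is implemented.

```ruby
# Hypothetical module: each Ractor lazily gets its own state, so no Mutex
# ever has to be reachable from a shared constant.
module TimerRegistry
  STORAGE_KEY = :timer_registry_state      # Symbols are shareable
  State = Struct.new(:mutex, :pending)     # classes are shareable too

  # Returns the state belonging to the current Ractor, creating it on
  # first use. Note: ||= is not atomic across threads of the same Ractor.
  def self.state
    Ractor.current[STORAGE_KEY] ||= State.new(Mutex.new, [])
  end

  # Records a deadline under the current Ractor's own mutex.
  def self.register(deadline)
    st = state
    st.mutex.synchronize { st.pending << deadline }
    st.pending.size
  end
end

TimerRegistry.register(Time.now + 5)
```

On Ruby 3.4 and later, Ractor.store_if_absent may be a safer way to do the lazy initialization when several threads run inside one Ractor.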

Now, more generally, if we got some shareable mutable state, then we'd need a shareable mutex. I just haven't yet encountered that case.

(CPython has done it, so it is feasible).

I don't want to go into that debate here, but quickly: Python hasn't done it yet. It's still very much an experimental work in progress, and they may still backtrack when they figure out how much slower it ends up being.
And while Python is probably the closest thing to Ruby out there, there are still some significant differences that may make it harder.

Updated by Eregon (Benoit Daloze) 5 days ago

Now, more generally, if we got some shareable mutable state, then we'd need a shareable mutex. I just haven't yet encountered that case.

Yes, and the shareable mutable state can't be accessed by Ractor, as Ractor prevents shareable mutable state entirely.
So making the Mutex shareable in most situations would likely mean it just fails a little bit later, in a way that isn't really solvable.
So I don't think it would help much if at all, but it would be interesting to see if there are examples where it would actually solve something.

For example on the Faraday case it would fail on @default_options = (on the Faraday::Middleware class).
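
A minimal sketch of that failure mode, using a hypothetical Middleware class rather than Faraday itself: assigning a class-level instance variable from a non-main Ractor is rejected regardless of whether any Mutex is involved. The respond_to?(:value) check is only there because the method for fetching a Ractor's result differs between Ruby versions.

```ruby
# Hypothetical stand-in for a library class holding mutable defaults.
class Middleware
  @default_options = {}.freeze

  class << self
    attr_reader :default_options

    def default_options=(opts)
      @default_options = opts   # class-level instance variable write
    end
  end
end

worker = Ractor.new do
  begin
    Middleware.default_options = { timeout: 5 }
    "assigned"
  rescue Ractor::IsolationError => e
    e.class.name                # report which error was raised
  end
end

# Ractor#value on newer Rubies, Ractor#take on older ones.
result = worker.respond_to?(:value) ? worker.value : worker.take
puts result
```

The write is expected to be refused with Ractor::IsolationError, so a shareable Mutex around it would not change the outcome.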

For Timeout there is another semantics mismatch: if you use Ractors you shouldn't use Threads, otherwise you lose most benefits of using an actor model.
So basically if you use Ractor you shouldn't use Timeout, even if Timeout wouldn't raise Ractor::IsolationErrors.

IOW, my impression is people try to use Ractor because they want parallelism and they don't care about the actor model and want to keep using threads (or gems that need threads). That seems a bad match full of clashes.

there are still some significant differences that may make it harder.

I think it's actually easier, though of course a lot of work.
CPython has reference counting, which is hell for concurrency (in terms of overhead at least).
CRuby has a rather normal GC, there are tons of existing concurrent GCs which could be adapted, nothing new there (though still a lot of work).
Array/Hash/Set/etc. would need synchronization; I would imagine we can use the same approach as CPython or another one, as there are many solutions.

Updated by osyoyu (Daisuke Aritomo) 3 days ago

if we got some shareable mutable state, then we'd need a shareable mutex

Come to think of it, I now do think that Mutexes themselves are not the problem. So yes I agree with you, and am okay to close this ticket (I can come back when I find another use case).

For example on the Faraday case it would fail on @default_options = (on the Faraday::Middleware class).

Yes, and I feel that the usage of class variables in libraries is a big barrier for Ractor adoption.
Too many gems in the wild, and some in the standard library (openssl, to name one), use class variables to keep "default" values or "cached" objects, which do not need to be mutable for the entire process life.
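
For defaults that never need to change after load, one possible approach is to deep-freeze them into a constant with Ractor.make_shareable. ClientDefaults, OPTIONS and with are hypothetical names for illustration and do not correspond to any particular gem.

```ruby
# Hypothetical module holding read-only defaults.
module ClientDefaults
  # make_shareable deep-freezes the Hash and everything reachable from it.
  OPTIONS = Ractor.make_shareable(
    { open_timeout: 60, headers: { "Accept" => "*/*" } }
  )

  # Callers derive their own mutable copy instead of mutating the default.
  def self.with(overrides)
    OPTIONS.merge(overrides)
  end
end

custom = ClientDefaults.with(open_timeout: 5)
p Ractor.shareable?(ClientDefaults::OPTIONS)
p custom
```

Because the constant is shareable, reading it from a non-main Ractor should not raise Ractor::IsolationError; per-caller changes live in the copy returned by with.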

For Timeout there is another semantics mismatch: if you use Ractors you shouldn't use Threads, otherwise you lose most benefits of using an actor model.
So basically if you use Ractor you shouldn't use Timeout, even if Timeout wouldn't raise Ractor::IsolationErrors.

I am not sure about this. Sending an HTTP request using net/http inside a Ractor should be a valid use case, and Timeout is blocking this (it is used internally to implement open_timeout, read_timeout and write_timeout).

Updated by Eregon (Benoit Daloze) 3 days ago

  • Status changed from Open to Rejected

osyoyu (Daisuke Aritomo) wrote in #note-5:

I am not sure about this. Sending an HTTP request using net/http inside a Ractor should be a valid use case, and Timeout is blocking this (it is used internally to implement open_timeout, read_timeout and write_timeout).

Ideally Net::HTTP wouldn't use Timeout (which is for interrupting code using too much CPU); interrupting I/O in general should be done without Timeout (because Timeout is too heavy for that).
Maybe Net::HTTP could use IO.select?
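
As a rough sketch of the general idea (not of Net::HTTP's internals), a read with a deadline can be built from read_nonblock and IO.select, without Timeout or an extra thread. read_with_deadline and ReadDeadlineExceeded are hypothetical names introduced here for illustration.

```ruby
# Hypothetical error class for this sketch.
class ReadDeadlineExceeded < StandardError; end

# Reads up to maxlen bytes from io, waiting at most timeout_sec seconds.
# Returns a String, or nil at end of file; raises on timeout.
def read_with_deadline(io, maxlen, timeout_sec)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout_sec
  loop do
    chunk = io.read_nonblock(maxlen, exception: false)
    return chunk unless chunk == :wait_readable

    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    raise ReadDeadlineExceeded, "no data within #{timeout_sec}s" if remaining <= 0

    # IO.select returns nil when the wait times out.
    unless IO.select([io], nil, nil, remaining)
      raise ReadDeadlineExceeded, "no data within #{timeout_sec}s"
    end
  end
end

reader, writer = IO.pipe
writer.write("hello")
puts read_with_deadline(reader, 16, 1)
```

Since nothing here touches process-global state, the same method should be usable from inside a Ractor as long as the IO object belongs to that Ractor.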

am okay to close this ticket

I'll close it then

Updated by osyoyu (Daisuke Aritomo) 3 days ago

Maybe Net::HTTP could use IO.select?

That sounds like a good idea and I'd like to try it. I have checked some previous attempts. Should I create a ticket here?

Updated by byroot (Jean Boussier) 2 days ago

Should I create a ticket here?

You can directly open a PR on https://github.com/ruby/net-http/.

If it doesn't get noticed after a while, you can open a corresponding ticket here.
