Feature #17592
closedRactor should allowing reading shareable class instance variables
Description
It would be very helpful if Ractor was allowing reading class instance variables from non-main Ractor.
Currently is raises an IsolationError:
module Foo
singleton_class.attr_accessor :config
Foo.config = {example: 42}.freeze
end
Ractor.new { p Foo.config } # => IsolationError
This limitation makes it challenging to have an efficient way to store general configs, i.e. global data that mutated a few times when resources get loaded but it immutable afterwards, and needs to be read all the time.
Currently the only way to do this is to use a constant and use remove_const
+ const_set
(which can not be made atomic easily).
I think that allowing reading only may be the best solution to avoid any race condition, e.g. two different Ractors that call @counter += 1
.
The only 3 scenarios I see here are:
0) declare the constant hack the official way to store config-style data
- allow reading of instance variables for shareable objects (as long as the data is shareable)
- allow read-write
I prefer 1)
Updated by marcandre (Marc-Andre Lafortune) over 3 years ago
Forgot to mention an example use-case: the URI global register for schemes see https://github.com/ruby/uri/pull/15
Updated by Eregon (Benoit Daloze) over 3 years ago
I think 1) or 2) is much better than 0).
And also this change will make it significantly easier to run existing code on Ractor, or to change the code to make it run with Ractor.
I think read-write would also be fine. But read-only is definitely a good step.
Concurrent @counter += 1
would already be an issue with threads, so it doesn't seem a new issue, callers need to care if they use the previous value.
For the frequent case of assigning a new shareable object, there is no problematic race there.
I suspect internally, reads and writes for module ivars will need synchronization anyway, so I guess implementation-wise it's easy to allow writes.
Something mentioned not so explicitly above is, of course, accessing @ivars
of a Module in a non-main Ractor should only be allowed if the value is shareable.
If the value is not shareable, then it must be IsolationError
.
Updated by marcandre (Marc-Andre Lafortune) over 3 years ago
From a discussion with @ko1 (Koichi Sasada), config could be (or should be?) using TVar
.
We need a good solution that is builtin.
If TVar
becomes builtin, then that is a possible solution. It seems like overkill for a mostly constant config (e.g. URI scheme list) and is not backwards compatible though.
Allowing reading
from instance variables has the advantage of being simple and backward compatible. Just to be clear: TVar
seems like interesting addition, but not particularly for config-style global state.
Updated by blowfishpro (Talia Wong) over 3 years ago
Would it be reasonable to force a class/module to be frozen (and its instance variables deep frozen) before being able to access class instance variables from a non-main Ractor?
There would probably need to be a new interface to do this as Ractor.shareable?
already returns true for a module and Ractor.make_shareable
does nothing to it.
Updated by marcandre (Marc-Andre Lafortune) over 3 years ago
blowfishpro (Joseph Wong) wrote in #note-4:
Would it be reasonable to force a class/module to be frozen (and its instance variables deep frozen) before being able to access class instance variables from a non-main Ractor?
Not really, no. Not only is it quite restrictive, but this gains nothing over status quo; you might as well use a constant, and deep-freeze it before you access it from non-main Ractor.
Updated by Dan0042 (Daniel DeLorme) over 3 years ago
If a class instance variable refers to a shareable object it would make sense to be able to read it in ractors. But what about reassigning the instance variable? How does the proposal work then?
## in main ractor:
@a = Ractor.make_shareable(data)
# @a is readable in ractors
@a = data
# @a is no longer readable in ractors? assignment disallowed?
## in non-main ractor:
@a = Ractor.make_shareable(data)
# ???
Updated by Eregon (Benoit Daloze) over 3 years ago
For simplicity, I think it's probably best to disallow reassigning in non-main Ractors.
It could be allowed though, as long as the value is shareable, then all other Ractors would notice the new value.
Reassigning in main Ractor should be allowed, and then raise when trying to read a non-shareable values from other Ractors.
i.e., the semantics of the main Ractor should not change whether there is only the main Ractor or multiple Ractors.
Updated by marcandre (Marc-Andre Lafortune) over 3 years ago
Right. My proposal disallows reassigning from non-main ractors.
Reassigning from main Ractor to a non-shareable data
is no problem, but future attempts to read it from non-main Ractor will result in IsolationError
(as they do currently).
# Main ractor:
String.singleton_class.attr_accessor :a
String.a = [].freeze
Ractor.new { p Ractor.a }.take # => []
Ractor.new { Ractor.a = nil }.take # => IsolationError (can't reassign from non-main Ractor)
String.a = []
Ractor.new { p Ractor.a } # => IsolationError (can't read non-shareable from non-main Ractor)
Updated by Dan0042 (Daniel DeLorme) over 3 years ago
That makes sense, but I'm curious about the implementation. I believe this requires synchronization of every access to a class instance variable?
Updated by marcandre (Marc-Andre Lafortune) over 3 years ago
Dan0042 (Daniel DeLorme) wrote in #note-9:
That makes sense, but I'm curious about the implementation. I believe this requires synchronization of every access to a class instance variable?
I'm not sure, but I think not. From non-main Ractor, grab the current value of the class instance variable. I think that can be made safe, e.g. if the ivar storage data uses immutable data structures.
Before returning the value, verify that it is shareable.
Objects have a "shareable" bit, but that might not be set for some shareable objects, so testing if an object is shareable might "modify" it by setting this "shareable" bit (see rb_ractor_shareable_p_continue
) but I believe that this can be done simultaneously by different Ractors without issue as it is idempotent and happens only on immutable objects. Of course, ko1 will know if I'm missing something 😅
Updated by Eregon (Benoit Daloze) over 3 years ago
Dan0042 (Daniel DeLorme) wrote in #note-9:
That makes sense, but I'm curious about the implementation. I believe this requires synchronization of every access to a class instance variable?
I think essentially all accesses to mutable data on a module must have some form of synchronization, because they can be mutated from the main thread in parallel.
So I'd think it's already synchronized.
This is BTW one thing I dislike about Ractor: @ivar accesses depend on the receiver, if it's a module it's completely different than for all other objects, and requires synchronization.
But, that's already there, so might as well allow reading from other Ractors than being both inconsistent and inconvenient.
Updated by Eregon (Benoit Daloze) over 3 years ago
To be more precise, and looking at the source, currently there is no synchronization for module ivars, only a check that only the main Ractor can access them:
https://github.com/ruby/ruby/blob/5803ac1c734568837d2010bd38f122ba24cbae2b/variable.c#L906-L913
The fast path for the "ivar set" bytecode already doesn't handle T_MODULE/T_CLASS:
https://github.com/ruby/ruby/blob/5803ac1c734568837d2010bd38f122ba24cbae2b/vm_insnhelper.c#L1250-L1253
And it ends up here which has the Module-specific logic:
https://github.com/ruby/ruby/blob/5803ac1c734568837d2010bd38f122ba24cbae2b/variable.c#L1480-L1484
Other state like constants and methods already have synchronization, e.g.,
https://github.com/ruby/ruby/blob/5803ac1c734568837d2010bd38f122ba24cbae2b/variable.c#L3621-L3625
Updated by ko1 (Koichi Sasada) over 3 years ago
This is a summary for tomorrow's dev-meeting by my understanding.
- ko1: This ticket proposal: To maintain global configuration (mutable information), class/module instance variables should be readable from other ractors.
module Foo
singleton_class.attr_accessor :config
Foo.config = {example: 42}.freeze
end
# Current
Ractor.new{ p Foo.config } # => IsolationError
# Proposal
Ractor.new do
p Foo.config #=> {example: 42}
p Foo.config = {example: 43}.freeze
#=> IsolationError, beacuse it is read-only from non-main ractors
end
Foo.config = {example: 44}.freeze # allowed updating from the main ractor
- ko1: There are two concerns: (1) atomicity concern and (2) performance concern.
(1) Atomicity concern
If two or more ivars (named @a
and @b
) should be update atomic, but threre is no way to synchronize them.
class C
@a = @b = 0
def self.update
# assertion: @a, @b should be equal
@a += 1
@b += 1
end
def self.vars
[@a, @b]
end
end
Main ractor can calls C.update
and update ivars. A ractor can call C.vars
and it can returns inconsist values (@a != @b
).
The danger of this concern is relatively low because this example is very artificial. Maybe most of usecase is initialization at loading time and no other ractors read while mutating. Also there is no coupled variables (like @a, @b
), there is no problem. For example, there is no problem with only one @config
ivar which manages all configrations. In other words, two or more configurations @configA
, @configB
, ... can have an atomicity issue.
The following code is also artifitial example.
class Fib
@a = @b = 1
# @a and @b are successive parts of the Fibonacci sequence.
def self.next
@a, @b = @b, @a + @b
end
def values
# it should return successive parts of the Fib seq.
# == "Fib seq constraint"
[@a, @b]
end
def self.eventloop
loop{
Ractor.receive; Fib.next
}
end
end
gen = Ractor.new(Ractor.current){|main| loop{ m << true } }
con = Ractor.new{
p Fib.value #=> return values can violate "Fib seq constraint"
}
If a user misused as an above example (using class/module ivars for the mutable state repository and updating them with multiple ractors (via the main-ractor)), it is danger.
For this concern, we have several options.
- (a) there is no problem to introduce this feature because it is almost safe.
- (b) Ractor is designed to avoid such consistency issues even if it can be avoided by careful programming. So this feature should not be introduced.
(b) is my position, but I agree it is very conservative.
This proposal has advantage for compatibility because many existing code can use ivars for sharing the global configurations and there is no need to rewrite them (if they only refer to sharable objects).
For example, pp
library has one global configuration: sharing_detection
which is stored in a class instance variable.
https://github.com/ruby/ruby/blob/master/lib/pp.rb#L109
For Ractor, it was rewrote by using Ractor-local configuration, but it was not ideal modification but ad-hoc modification with existing tools.
If this proposal is introduced, we can revert the ad-hoc modification.
(2) Performance concern
To allow accessing ivars from other ractors, every ivar accesses to class/module should be synchronized. Current implementation doesn't need to synchronize this accesses.
For constants and method search, we implemented const cache and method cache mechanisms. Such cache mechanisms are reasonable because they are not frequently rewriting. On the other hands, ivars can be mutated more frequently and not sure the it is reasonable to have a cache mechanism for it.
ko1: Alternative proposal is using TVar.
Rewriting configuration example with TVar:
module Foo
Config = Ractor::TVar.new{ {example: 42}.freeze }
end
Ractor.new do
Ractor.atomically do
p Foo::Config.value #=> {example: 42}
end
...
Ractor.atomically do
Foo::Config.value = {example: 43}.freeze
# modification is allowed within atomically block
end
end
Ractor.atomically do
Foo::Config.value = {example: 44}.freeze
end
The advantage of this example is it is clear that manipulating sharable state between Ractors because Ractor.atomically
method is needed. Another way of saying this is that we can become more aware of creating a shared state. With instance variables, it is hard to figure out which instance variables are shared with multiple Ractors.
Disadvantages:
- Writing
Ractor.atomically
is long to type. - Incompatible with older versions.
Configuration should not be changed frequently, so the performance should not be a problem.
module Foo
Config = Ractor::TVar.new{ {example: 42}.freeze }
end
Ractor.new do
p Foo::Config.value #=> {example: 42}
end
Ractor.atomically do
Foo::Config.value = {example: 44}.freeze
end
Rewriting other examples with TVar.
class C
A = Ractor::TVar.new 0
B = Ractor::TVar.new 0
def self.update
# assertion: A.value and B.value should be equal
Ractor.atomically do
A.value += 1
B.value += 1
end
end
def self.vars
Ractor.atomically do
[A.value, B.value]
end
end
end
class Fib
A = Ractor::TVar.new 1
B = Ractor::TVar.new 1
def self.next
Ractor.atomically do
A.value, B.value = B.value, A.value + B.value
end
end
def values
# it should return successive parts of the Fib seq.
# == "Fib seq constraint"
Ractor.atomically do
[A.value, B.value]
end
end
def self.eventloop
loop{
Ractor.receive; Fib.next
}
end
end
gen = Ractor.new(Ractor.current){|main| loop{ m << true } }
con = Ractor.new{
p Fib.value #=> TVar allows us to keep constraint
}
Fib.eventloop
# NOTE: Using TVars, there is no reason to maintain states
# by a main-ractor, so gen ractor can access Fib.next directly.
gen = Ractor.new{loop{ Fib.next } }
con = Ractor.new{
p Fib.value #=> TVar allows us to keep constraint
}
Updated by ko1 (Koichi Sasada) over 3 years ago
Eregon (Benoit Daloze) wrote in #note-11:
This is BTW one thing I dislike about Ractor: @ivar accesses depend on the receiver, if it's a module it's completely different than for all other objects, and requires synchronization.
But, that's already there, so might as well allow reading from other Ractors than being both inconsistent and inconvenient.
Sorry, I couldn't understand this point.
There is no special for class/module as a receiver objects (implementation is special, but no Ruby-level difference).
Do I miss something?
Updated by Eregon (Benoit Daloze) over 3 years ago
ko1 (Koichi Sasada) wrote in #note-14:
Sorry, I couldn't understand this point.
There is no special for class/module as a receiver objects (implementation is special, but no Ruby-level difference).
Do I miss something?
There is a behavior difference, before Ractor @foo
would always work and never raise.
And @foo =
would only raise if the receiver is frozen.
Inside a Ractor, currently @foo
works unless self.is_a?(Module)
, same for @foo =
(the inconsistency I mention: the same syntax has widely different semantics based on the receiver).
If we accept this proposal, then at least @foo
on a Module works if the value can be safely read (= the value is shareable).
Updated by ko1 (Koichi Sasada) over 3 years ago
Eregon (Benoit Daloze) wrote in #note-15:
There is a behavior difference, before Ractor
@foo
would always work and never raise.
And@foo =
would only raise if the receiver is frozen.
OK, you meant sharable or not, but not about class/module (class/module are typical cases, though).
Updated by Eregon (Benoit Daloze) over 3 years ago
ko1 (Koichi Sasada) wrote in #note-16:
OK, you meant sharable or not, but not about class/module (class/module are typical cases, though).
Isn't it actually only about class/module though?
For a Fixnum
or a Ractor.make_shareable(Object.new)
, it's allowed to read @ivars. And it's a regular FrozenError to try to write an @ivar since they are frozen.
But for class/module, and I think only for those, it's currently not allowed to even read @ivars, even though class/module are considered shareable.
Updated by marcandre (Marc-Andre Lafortune) over 3 years ago
ko1 (Koichi Sasada) wrote in #note-13:
(2) Performance concern
To allow accessing ivars from other ractors, every ivar accesses to class/module should be synchronized.
I am surprised this is the case. Would it not be possible to have overwriting an existing instance variable be a single memory write operation? Adding a new instance variable would similarly involve creating a new ivar table and setting it with a single memory write operation?
Current implementation doesn't need to synchronize this accesses.
That is not completely accurate. Current implementation does not but it is not 100% safe. The following completely artificial code crashes (sometimes):
class String
def self.test
4000.times.map { |i| eval(<<~RUBY) }.each(&:join)
Thread.new do
10.times do
Thread.pass
@x#{i} = 42
end
end
RUBY
end
end
String.test
p String.instance_variables.size # => 4000
15:19][~/temp(master)]$ ruby temp.rb
4000
[15:20][~/temp(master)]$ ruby temp.rb
4000
[15:20][~/temp(master)]$ ruby temp.rb
4000
SEGV received in SEGV handler
[BUG] Segmentation fault at 0x0000000000000380
ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-darwin18]
-- Crash Report log information --------------------------------------------
See Crash Report log file under the one of following:
* ~/Library/Logs/DiagnosticReports
* /Library/Logs/DiagnosticReports
for more details.
Don't forget to include the above Crash Report log file in bug reports.
-- Machine register context ------------------------------------------------
rax: 0x0000000000000000 rbx: 0x0000000000000000 rcx: 0x0000000000000000
rdx: 0x0000000000000000 rdi: 0x00007f89c3b7cc00 rsi: 0x00007000302b5ea4
rbp: 0x00007000302b5e60 rsp: 0x00007000302b5e40 r8: 0x0000000000198bf6
r9: 0xffffffff00000000 r10: 0x0000000106636d48 r11: 0x00007000302b5ea4
r12: 0x00007000302b5ed0 r13: 0x00007000302b5e88 r14: 0x00007f89c3b7cc00
r15: 0x00007f89c3b7cc00 rip: 0x00000001063a2f5c rfl: 0x0000000000010206
-- C level backtrace information -------------------------------------------
Abort trap: 6
Updated by Eregon (Benoit Daloze) over 3 years ago
marcandre (Marc-Andre Lafortune) wrote in #note-18:
I am surprised this is the case. Would it not be possible to have overwriting an existing instance variable be a single memory write operation? Adding a new instance variable would similarly involve creating a new ivar table and setting it with a single memory write operation?
That would lose writes if there are concurrent writes for both a new variable and an existing one (illustration).
It always needs some form of synchronization, or more indirections (e.g., chaining).
Updated by Eregon (Benoit Daloze) over 3 years ago
I should add, instance variable writes on modules on TruffleRuby are already synchronized (for modules assigned to a constant, and effectively reachable by multiple threads), and that doesn't seem to be a performance issue at all.
Reading can be done without synchronization for existing variables (just need to check if the ivar storage is large enough if it's only growing).
Updated by marcandre (Marc-Andre Lafortune) over 3 years ago
Sorry, I should have been more clear. Synchronization for writing: yes (as it should probably already be the case, see my crashing example). I meant that the access for reading wouldn't have to be synchronized I believe.
In short: allowing reads from Ractors doesn't add any need for synchronization
Updated by marcandre (Marc-Andre Lafortune) over 3 years ago
Additionally, my suggestion would make synchronization not necessary across threads in the sense that multiple writes from different threads would simply produce "bad" results but wouldn't crash. It would be up to developpers to use Mutex
in those cases. Still, probably best to synchronize writes on the main Ractor threads.
Updated by Dan0042 (Daniel DeLorme) over 3 years ago
marcandre (Marc-Andre Lafortune) wrote in #note-18:
Adding a new instance variable would similarly involve creating a new ivar table and setting it with a single memory write operation?
That seems like it would work. Then only writes would need to be synchronized.
But my understanding is that ivar tables are not ruby objects that can be automatically garbage-collected. You would need to deallocate the old table at some point when it is safe to do so (no readers currently using it).
Updated by marcandre (Marc-Andre Lafortune) over 3 years ago
- Status changed from Open to Assigned
This has been accepted 🎉
Follow-up question: could we introduce this change in 3.0.x line? Otherwise we have to wait until 3.1 and many developers might resort to the const_set
hack in the meantime...
Updated by Eregon (Benoit Daloze) almost 3 years ago
- Has duplicate Bug #18193: Accessing global configuration from Ractors added
Updated by ko1 (Koichi Sasada) almost 3 years ago
- Status changed from Assigned to Closed