Feature #21039
openRactor.make_shareable breaks block semantics (seeing updated captured variables) of existing blocks
Description
def make_counter
count = 0
nil.instance_exec do
[-> { count }, -> { count += 1 }]
end
end
get, increment = make_counter
reader = Thread.new {
sleep 0.01
loop do
p get.call
sleep 0.1
end
}
writer = Thread.new {
loop do
increment.call
sleep 0.1
end
}
ractor_thread = Thread.new {
sleep 1
Ractor.make_shareable(get)
}
sleep 2
This prints:
1
2
3
4
5
6
7
8
9
10
10
10
10
10
10
10
10
10
10
10
But it should print 1..20, and indeed it does when commenting out the Ractor.make_shareable(get)
.
This shows a given block/Proc instance is concurrently broken by Ractor.make_shareable
, IOW Ractor is breaking fundamental Ruby semantics of blocks and their captured/outer variables or "environment".
It's expected that Ractor.make_shareable
can freeze
objects and that may cause some FrozenError, but here it's not a FrozenError, it's wrong/stale values being read.
I think what should happen instead is that Ractor.make_shareable
should create a new Proc and mutate that.
However, if the Proc is inside some other object and not just directly the argument, that wouldn't work (like Ractor.make_shareable([get])
).
So I think one fix would to be to only accept Procs for Ractor.make_shareable(obj, copy: true)
.
FWIW that currently doesn't allow Procs, it gives <internal:ractor>:828:in 'Ractor.make_shareable': allocator undefined for Proc (TypeError)
.
It makes sense to use copy
here since make_shareable
effectively takes a copy/snapshot of the Proc's environment.
I think the only other way, and I think it would be a far better way would be to not support making Procs shareable with Ractor.make_shareable
.
Instead it could be some new method like isolated { ... }
or Proc.isolated { ... }
or Proc.snapshot_outer_variables { ... }
or so, only accepting a literal block (to avoid mutating/breaking an existing block), and that would snapshot outer variables (or require no outer variables like Ractor.new's block, or maybe even do Ractor.make_shareable(copy: true)
on outer variables) and possibly also set self
since that's anyway needed.
That would make such blocks with different semantics explicit, which would fix the problem of breaking the intention of who wrote that block and whoever read that code, expecting normal Ruby block semantics, which includes seeing updated outer variables.
Related: #21033 https://bugs.ruby-lang.org/issues/18243#note-5
Extracted from https://bugs.ruby-lang.org/issues/21033#note-14
Updated by Eregon (Benoit Daloze) 6 months ago
- ruby -v set to ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x86_64-linux]
Updated by Eregon (Benoit Daloze) 6 months ago
- Related to Bug #18243: Ractor.make_shareable does not freeze the receiver of a Proc but allows accessing ivars of it added
Updated by Eregon (Benoit Daloze) 6 months ago
- Related to Feature #21033: Allow lambdas that don't access `self` to be Ractor shareable added
Updated by luke-gru (Luke Gruber) 6 months ago
As far as I know this is intentional behavior, so even though I agree it is confusing I think this is more accurately a feature request instead of a bug.
Updated by Eregon (Benoit Daloze) 6 months ago
I think it's a bug, because it breaks fundamental Ruby block semantics.
No Ractor method or functionality should be able to do that for an existing Proc, even more so when that Proc is called on the main Ractor.
Updated by luke-gru (Luke Gruber) 6 months ago
Okay fair enough, and it's not of much consequence either way whether it's a bug or a feature because your point still stands.
Updated by matz (Yukihiro Matsumoto) 6 months ago
- Tracker changed from Bug to Feature
- ruby -v deleted (
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x86_64-linux]) - Backport deleted (
3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN)
- It's intended behavior. We don't call it a bug. If we keep the non-shareable semantics, the basic principle of Ractors would fail (no shared state).
- I understand your feeling to the new behavior. The behavior can be negotiable, but it should follow Ractor principle.
Matz.
Updated by hsbt (Hiroshi SHIBATA) 4 months ago
- Status changed from Open to Assigned
Updated by Eregon (Benoit Daloze) 30 days ago
I think a good solution here would be:
- raise on
Ractor.make_shareable(proc)
- raise on
Ractor.make_shareable(proc, copy: true)
- allow
Ractor.make_shareable { ... }
but only with a block literal. That way, the fact that block behaves differently by copying its environment is very clear. The method could check that the block doesn't do assignments in the environment as well to avoid surprises. It could also set theself
in the block tonil
, and/or take a keyword argument to set the receiver, or use the block's original self and error if it cannot be made shareable.
Updated by matz (Yukihiro Matsumoto) 16 days ago
I'd like to have Ractor.shareble_proc
and Ractor.sharable_lambda
. See the agenda from 20250710 developer meeting.
Matz.
Updated by mame (Yusuke Endoh) 15 days ago
-
Ractor.shareable_proc { }
returns a Proc that is shareable between ractors. In the proc,self
is nil. -
Ractor.shareable_proc(self: 42) { }
returns a Proc that is shareable between ractors. In the proc,self
is42
. -
Ractor.shareable_lambda
returns a lambda-version ofRactor.shareable_proc
.
Updated by mame (Yusuke Endoh) 15 days ago
The followings are also approved; changing an existing proc object to shareable should be prohibited.
- raise on
Ractor.make_shareable(proc)
- raise on
Ractor.make_shareable(proc, copy: true)
Updated by Eregon (Benoit Daloze) 11 days ago
Great to hear, this makes a lot of sense and addresses the original semantics issue perfectly.
Updated by tenderlovemaking (Aaron Patterson) 4 days ago
I think these make sense, but I would also like to propose that Ractor.shareable_proc
take a block that isn't a literal and returns a new proc that is shareable.
For example:
not_shareable = ->{ ... }
shareable = Ractor.shareable_proc(¬_shareable)
I think forcing the proc to be a literal put a very big limitation on Ractor usefulness. A simple example is a Sinatra app:
# A simple Sinatra app
get "/" do
"Hello world"
end
Since the Sinatra API uses procs, we wouldn't be able to serve the Sinatra request from a Ractor. If we can pass a non-literal block to Ractor.shareable_proc
, I think that would solve the issue.
Updated by Eregon (Benoit Daloze) 4 days ago
ยท Edited
@tenderlovemaking (Aaron Patterson) The issue with that is it still breaks the block semantics as in the OP description, specifically reading of captured variables inside the block is snapshotted for Ractor-shareable-blocks:
$ ruby -e 'count = 0; b = nil.instance_exec { -> { count } }; p b.call; count += 1; p b.call'
0
1
$ ruby -e 'count = 0; b = nil.instance_exec { -> { count } }; b2 = Ractor.shareable_proc(&b); p b2.call; count += 1; p b2.call'
0
0
It's the general guarantee in Ruby that a given literal block always behave the same, e.g. it's either a proc or lambda, but not both (except when using send(condition ? :proc : :lambda) { ... }
but that's explicit then), and in this case it's either a block respecting updated captures variables or not.
So how to keep this important guarantee (i.e. the block author can rely on captured variables to behave as they always have been for Ruby blocks and respect reassignments) and allow the flexibility you want?
BTW Ractor.make_shareable
on a Proc which assigns captured variables is an error (good, better to fail early than silently ignore the write):
$ ruby -e 'count = 0; b = nil.instance_exec { -> { count += 1 } }; Ractor.make_shareable b'
-e:1:in 'Ractor.make_shareable': can not make a Proc shareable because it accesses outer variables (count). (ArgumentError)
from -e:1:in '<main>'
Maybe one way would be for Ractor.shareable_proc
to be an error if there is any code around that Proc assigning any captured variable?
It can't detect binding
and eval
though, so that's still not complete.
One way for that Sinatra case would be to write:
get "/", &Ractor.shareable_proc {
"Hello world"
}
but this only really works if there are no captured variables, or captured variables are not reassigned and the contents of captured variables is shareable, so probably in many realistic cases it doesn't work anyway.
Updated by tenderlovemaking (Aaron Patterson) 2 days ago
I understand your argument, but I don't agree this is an issue.
ruby -e 'count = 0; b = nil.instance_exec { -> { count } }; b2 = Ractor.shareable_proc(&b); p b2.call; count += 1; p b2.call'
In this code we've explicitly converted b
to a shareable proc, b2
. The value of count
is internally consistent from the standpoint of b2
since shareable_proc
explicitly copied / disconnected the captured environment. I understand the problem you're pointing out, but I'm not convinced it would be a big deal in practice.
get "/", &Ractor.shareable_proc {
"Hello world"
}
This can't be a serious suggestion? It's basically saying that no existing Sinatra code could run inside a Ractor based webserver. If we had new syntax for Ractor.shareable_proc
, I could see that being easier to swallow, but this doesn't seem acceptable (to me at least).
Updated by Eregon (Benoit Daloze) 1 day ago
tenderlovemaking (Aaron Patterson) wrote in #note-18:
This can't be a serious suggestion? It's basically saying that no existing Sinatra code could run inside a Ractor based webserver. If we had new syntax for
Ractor.shareable_proc
, I could see that being easier to swallow, but this doesn't seem acceptable (to me at least).
Yeah, I understand your concern, and I meant this mostly as a workaround, while finding what other parts of Ractor prevents using Ractor for realistic Ruby code.
OTOH I'm rather skeptical that even if Ractor.shareable_proc
would be allowed with a non-literal block that we'd be able to run Sinatra (or Rails, etc) apps on Ractors.
Not even MSpec runs on Ractor, and that's pretty simple logic, yet making it Ractor-compatible in a non-ugly-and-complicated way seems very hard.
I think it would be good to have a construct (be it syntax or a Kernel or Proc method), independent of Ractor, to create a Proc which snapshots its environment, and is not allowed to write to its environment.
That way, that Proc would have the same semantics whether Ractors are used or not.
That concept on its own is very useful for optimizations and JITs, in fact TruffleRuby already has this functionality internally.
One complication though is it's pretty expensive to do this, as on every call to that method it allocates a new Proc and potentially copies + change the bytecode to replace captured variable reads with their values (thought there might be other ways to do this).
Having it as syntax or an intrinisified method (which would mean cannot be redefined and must be detectable at parse time, no metaprogramming call to it) would help.
It would be useful for define_method
too, and would mean methods defined with define_method
could be as fast as def
when called.
To make it useful for Ractor we'd need that construct to also support setting the receiver, as in the Ractor.shareable_proc(self: 42) { }
case from above.
And it would also need to either check that captured variable values are shareable, or make them shareable. That part is a bit weird when not using Ractors though, especially making shareable.
Checking for shareable seems better anyway, because making them shareable would need to copy to be safe in general, but maybe the user want to do it inplace if they know that's safe, etc.
It could be something like:
captured = 7
p = Proc.isolated(self: 6) { self * captured }
captured = 10
p.call # => 42
Syntax seems better-defined for the semantics and probably would look cleaner, but I'm not sure what would be a good syntax, and then of course it can't work on older Ruby versions at all, even when not using Ractors.
Unless we use something cheeky like proc { |a,b; isolated| }
maybe, but that wouldn't allow setting the self (would have to be done with instance_exec
around it), and would have different semantics on different versions which is not great.
Updated by tenderlovemaking (Aaron Patterson) 1 day ago
Eregon (Benoit Daloze) wrote in #note-19:
tenderlovemaking (Aaron Patterson) wrote in #note-18:
This can't be a serious suggestion? It's basically saying that no existing Sinatra code could run inside a Ractor based webserver. If we had new syntax for
Ractor.shareable_proc
, I could see that being easier to swallow, but this doesn't seem acceptable (to me at least).Yeah, I understand your concern, and I meant this mostly as a workaround, while finding what other parts of Ractor prevents using Ractor for realistic Ruby code.
OTOH I'm rather skeptical that even ifRactor.shareable_proc
would be allowed with a non-literal block that we'd be able to run Sinatra (or Rails, etc) apps on Ractors.
Not even MSpec runs on Ractor, and that's pretty simple logic, yet making it Ractor-compatible in a non-ugly-and-complicated way seems very hard.
I don't have any numbers, but my intuition is that most non-literal, global procs (procs that are reachable via the application), including ones provided by the user, don't depend on environment mutations that occur outside of the block. Outside of iterators, depending on mutations to one's captured environment would be extremely confusing and hard to track behavior, so IME most people don't do it in practice.
But besides that, I'm just proposing that Ractor.shareable_proc
be allowed to take a non-literal block. This would allow frameworks to pick and choose which lambdas should be "safe" for a Ractor.
If someone had a Sinatra app that depended on env mutations like this:
counter = 0
get "/" do # assume the proc gets copied here so counter is 0
"Hello world #{counter}"
counter += 1
end
counter += 123
I think a user running a Ractor webserver would report an issue with the webserver since it would behave "as expected" on a non-Ractor webserver.
I think it would be good to have a construct (be it syntax or a Kernel or Proc method), independent of Ractor, to create a Proc which snapshots its environment, and is not allowed to write to its environment.
That way, that Proc would have the same semantics whether Ractors are used or not.
I'm not sure why it matters whether the proc can write to the captured env or not, since we can just copy the environment and attach it to the proc. To me this is similar to dup'ing an object. If I mutate one copy, I don't expect those mutations to be reflected in the other.
That concept on its own is very useful for optimizations and JITs, in fact TruffleRuby already has this functionality internally.
One complication though is it's pretty expensive to do this, as on every call to that method it allocates a new Proc and potentially copies + change the bytecode to replace captured variable reads with their values (thought there might be other ways to do this).
Having it as syntax or an intrinisified method (which would mean cannot be redefined and must be detectable at parse time, no metaprogramming call to it) would help.
It would be useful fordefine_method
too, and would mean methods defined withdefine_method
could be as fast asdef
when called.
I've thought of doing this by mprotecting the escaped env and only allowing reads ๐
To make it useful for Ractor we'd need that construct to also support setting the receiver, as in the
Ractor.shareable_proc(self: 42) { }
case from above.
And it would also need to either check that captured variable values are shareable, or make them shareable. That part is a bit weird when not using Ractors though, especially making shareable.
Checking for shareable seems better anyway, because making them shareable would need to copy to be safe in general, but maybe the user want to do it inplace if they know that's safe, etc.
It could be something like:captured = 7 p = Proc.isolated(self: 6) { self * captured } captured = 10 p.call # => 42
Syntax seems better-defined for the semantics and probably would look cleaner, but I'm not sure what would be a good syntax, and then of course it can't work on older Ruby versions at all, even when not using Ractors.
Unless we use something cheeky likeproc { |a,b; isolated| }
maybe, but that wouldn't allow setting the self (would have to be done withinstance_exec
around it), and would have different semantics on different versions which is not great.
Anyway, by not allowing non-literal blocks, we can't even abstract calls to shareable_proc
and I think that really hampers the usefulness of Ractors. I really think we should find a way to support non-literal blocks.