Project

General

Profile

Bug #17543

Ractor isolation broken by `self` in shareable proc

Added by marcandre (Marc-Andre Lafortune) about 2 months ago. Updated about 1 month ago.

Status:
Open
Priority:
Normal
Target version:
-
[ruby-core:102102]

Description

Discussing with MaxLap (Maxime Lapointe) we realized that the self in a shareable proc is not properly isolated:

class Foo
  attr_accessor :x

  def pr
    Ractor.make_shareable(Proc.new { self })
  end
end

f = Foo.new
f.x = [1, 2, 3]
Ractor.new(f.pr) { |pr| pr.call.x << :oops }
p f.x # => [1, 2, 3, :oops]

If the self refers to a shareable object then it's fine, but for non-shareable objects it has to be reset to nil or to a global shareable object that would have an instructive inspect.

Ractor::DETACHED_SELF = Object.new
def << Ractor::DETACHED_SELF
  def inspect
    '<#detached self>'
  end
  alias to_s inspect
end
Ractor::DETACHED_SELF.freeze

Updated by Eregon (Benoit Daloze) about 2 months ago

Probably the self of the Proc should be made Ractor.make_shareable too.
Changing the self of an arbitrary Proc would be very surprising I think.

Updated by marcandre (Marc-Andre Lafortune) about 2 months ago

Probably the self of the Proc should be made Ractor.make_shareable too.

Would that always be the case? would all the following freeze the self?

class Foo
  def test(&block)
    Ractor.make_shareable(block)
  end

  def foo
    test(&:int) # freeze self or not?
    test{ 2 + 2 } # freeze self or not?
    test{ puts 'hello' } # freeze self or not?
  end
end

The idea behind my proposed fix is to permit all three above without deep-freezing the self.

Updated by Eregon (Benoit Daloze) about 2 months ago

Not preserving self would be confusing, e.g., when calling other methods like:

class Foo
  def bar
    ...
  end

  def foo
    test { bar }
  end
end

It does mean the Foo instance would be (deeply) frozen.
It's a trade-off between being able to call other methods and freeze self, or magically instance_exec the block.

I think the only case where it's acceptable to change the self of a proc implicitly is Ractor.new {} because that already cannot capture any local variables:

ruby -e 'o=Object.new; Ractor.new { o }'                            
<internal:ractor>:267:in `new': can not isolate a Proc because it accesses outer variables (o). (ArgumentError)

Capturing variables or self is very similar IMHO, so I think they should be treated the same.

ruby -e 'o=Object.new; Ractor.make_shareable(-> { o }); p o.frozen?'
<internal:ractor>:816:in `make_shareable': can not make shareable Proc because it can refer unshareable object #<Object:0x0000000000d5ca28> from variable `o' (Ractor::IsolationError)

The most obvious fix is therefore to raise the same error if self is not shareable.

Although, I would have expected that to print true and freeze o.
What's the point of Ractor.make_shareable if it doesn't deeply make shareable as needed?

Updated by marcandre (Marc-Andre Lafortune) about 2 months ago

Eregon (Benoit Daloze) wrote in #note-3:

Not preserving self would be confusing, e.g., when calling other methods like:

class Foo
  def bar
    ...
  end

  def foo
    test { bar }
  end
end

In my proposal, calling that block would result in a NoMethodError on <#DetachedSelf>. Is it that confusing? The detached self could be created dynamically with the location of the call to make_shareable to help debugging.

I think the only case where it's acceptable to change the self of a proc implicitly

I am not sure why you state "implicitly". It is explicit that make_shareable changes the block in more ways than just freezing, in particular by detaching the local variables and binding.

is Ractor.new {} because that already cannot capture any local variables:

ruby -e 'o=Object.new; Ractor.new { o }'                            
<internal:ractor>:267:in `new': can not isolate a Proc because it accesses outer variables (o). (ArgumentError)

Capturing variables or self is very similar IMHO, so I think they should be treated the same.

ruby -e 'o=Object.new; Ractor.make_shareable(-> { o }); p o.frozen?'
<internal:ractor>:816:in `make_shareable': can not make shareable Proc because it can refer unshareable object #<Object:0x0000000000d5ca28> from variable `o' (Ractor::IsolationError)

Note that they are not treated the same way. Ractor.new{} forbids any reference to outside variables, while make_shareable(proc) allows it as long as the values are themselves shareable.

o=Object.new.freeze
Ractor.new { o } # => ArgumentError
Ractor.make_shareable(-> { o }) # => no error

The most obvious fix is therefore to raise the same error if self is not shareable.

I agree that is the most obvious, but that makes Ractor.make_shareable(-> { puts "hello" }) fail, which is far from being useful.

Updated by Eregon (Benoit Daloze) about 2 months ago

marcandre (Marc-Andre Lafortune) wrote in #note-4:

Note that they are not treated the same way. Ractor.new{} forbids any reference to outside variables, while make_shareable(proc) allows it as long as the values are themselves shareable.

Yes, Ractor.new{} seems special, but actually it seems like the Ractor.make_shareable() semantics would be more useful, that way some frozen data could be passed directly in the new Ractor.

I agree that is the most obvious, but that makes Ractor.make_shareable(-> { puts "hello" }) fail, which is far from being useful.

In my view, Ractor.make_shareable should deeply freeze/share some object graph rooted by some initial object.
So I think it should freeze captures and self for a Proc.
If it's needed to e.g. Ractor.make_shareable(self); o = Ractor.make_shareable(Object.new); Ractor.make_shareable(-> { p o; ... }), it feels so redundant.
I think Ractor.make_shareable should always succeed, except if it sees a truly unshareable object (e.g., a Mutex).

Updated by marcandre (Marc-Andre Lafortune) about 2 months ago

Eregon (Benoit Daloze) wrote in #note-5:

If it's needed to e.g. Ractor.make_shareable(self); o = Ractor.make_shareable(Object.new); Ractor.make_shareable(-> { p o; ... }), it feels so redundant.

I may be wrong, but I believe that in practice the objects that you will want to make shareable, and the blocks you want to make shareable will rarely if ever be related.

Updated by MaxLap (Maxime Lapointe) about 1 month ago

Warning: The following code examples can be ugly. This is low level stuff meant to build nicer blocks on top. Viewer discretion is advised.

As codebases using Ractors grow, I expect people would want to put the logic elsewhere, in classes, and module.

Here is a very simple idea:

class Worker
  def initialize
    @nb_iteration = 0
  end

  def work(other_ractor)
    value = 123
    other_ractor.send([:use_this_block, Ractor.make_shareable(proc { |k| k << value}) ])

    @nb_iteration += 1
  end
end

other_ractor = Ractor.new do
  # use the block with receives...
  sleep(600)
end

w = Worker.new
10.times { w.work(other_ractor) }

I would expect this to work fine. The block is really just "Code I want the other side to execute". But making self shareable would break this.

In my mind, it's a lot more confusing that later after the call, at one point, the object raises FrozenError (can't modify frozen Worker...). There won't be a helpful error message, and that call could be far away, no hints that the Ractor did it. A bit of a footgun.

And consider, if the developper needed my example to work, what would he do? I can think of many variations of "Make the self something else":

my_proc = Ractor.make_shareable(Object.instance_eval { -> (k) { k << value } })
other_ractor.send([:use_this_block, my_proc])

But now this also needs a comment, because someone seeing this will be asking questions, unless it's used everywhere (not a pretty outlook either).

It's also quite possible that the Ractor on the other side would use the block in an instance_eval, to change the self. It's a pattern that happen from time to time. In that case, the object was frozen (broken?) with no benefit.

Now, consider the alternative proposed by Marc-Andre.

The idea of making it a special object is to avoid needing special checks during the execution of a shared block, while still allowing error messages to be helpful. The inspect could be as explicit as desired: "". It can still get confusing if the self is passed around, but as soon as you try to use it, it would fail with a NoMethodError. The message could even have a link to a page with details about this and ractors.

And if the person does want to go the make self sharable way, it's easy and clear:

Ractor.make_shareable(self)
other_ractor.send([:use_this_block, Ractor.make_shareable(proc { |k| k << value})])

Updated by marcandre (Marc-Andre Lafortune) about 1 month ago

MaxLap (Maxime Lapointe) wrote in #note-7:

It's also quite possible that the Ractor on the other side would use the block in an instance_eval, to change the self.

Yes, or define_method. My solution has the advantage of not having any effect in those cases.

Updated by Eregon (Benoit Daloze) about 1 month ago

I think conceptually sharing the self would be cleaner/simpler/consistent-with-captures-vars, but I agree in practice that's rarely wanted.
So changing the receiver to something like Ractor::DETACHED_SELF is probably the most convenient.

MaxLap (Maxime Lapointe) wrote in #note-7:

And if the person does want to go the make self sharable way, it's easy and clear:

Ractor.make_shareable(self)
other_ractor.send([:use_this_block, Ractor.make_shareable(proc { |k| k << value})])

Ah, so Ractor.make_shareable would use the existing receiver if shareable and Ractor::DETACHED_SELF otherwise?
That seems nice e.g. for a block in a class method, where the class itself is shareable.
It might lead to some confusion to have such dynamic behavior though, which also depends on whether self was made shareable at some point.
Specifically, the same call site for Ractor.make_shareable(proc { |k| ... }) could have different receivers, either Ractor::DETACHED_SELF or the surrounding self if self is shareable.

Updated by ko1 (Koichi Sasada) about 1 month ago

choose nil for self for sharable Proc? No special constant is needed.

Updated by Eregon (Benoit Daloze) about 1 month ago

ko1 (Koichi Sasada) wrote in #note-10:

choose nil for self for sharable Proc? No special constant is needed.

That sounds confusing, if e.g., the NoMethodError comes from a line like [foo, bar.foo], is it because self is nil or because bar returns nil?
Seems impossible to know if detached self is nil.

Updated by marcandre (Marc-Andre Lafortune) about 1 month ago

self set to nil is a possibility, but it seems harder to debug than special purpose object. What is the "cost" of having a special object? Is there a downside?

Also nil has some additional methods that might make it even more confusing, e.g. to_h.

Also available in: Atom PDF