Feature #17278

On-demand sharing of constants for Ractor

Added by Dan0042 (Daniel DeLorme) about 1 month ago. Updated 13 days ago.

Status:
Feedback
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:100479]

Description

This proposal aims to reduce (but not eliminate) the need for freezing/sharing boilerplate code needed by ractors.

A = [1, [2, [3, 4]]]
H = {a: "a"}
Ractor.new do
  p A  # A is not actually modified anywhere, so ok
end.take
H[:b] = "b"  # H was never touched by a ractor, so ok

Background

Ractors require objects to be preemptively deep-frozen in order to be shared between ractors. This has an especially visible and restrictive effect on globals and constants. I tried thinking of a different way, and maybe I found one. So please allow me to humbly present this possibility.
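For reference, here is a minimal sketch of the deep-freezing requirement, assuming Ruby 3.0 or later where Ractor.make_shareable and Ractor.shareable? are available. Local variables stand in for constants only to keep the snippet self-contained.

```ruby
# A shallow freeze is not enough: the nested arrays stay mutable.
shallow = [1, [2, [3, 4]]].freeze
shallow_ok = Ractor.shareable?(shallow)   # false

# Ractor.make_shareable freezes the whole object graph.
deep = Ractor.make_shareable([1, [2, [3, 4]]])
deep_ok = Ractor.shareable?(deep)         # true
inner_frozen = deep[1][1].frozen?         # true
```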

Proposal

A constant would by default be in an "auto-shareable" state (A), which can change atomically to either
(B) "non-shareable" if it is modified by the main ractor
(C) "shareable" (and frozen) if it is accessed by a non-main ractor

In detail:

  1. When an object is assigned to a constant, it is added to a list of ractor-reachable objects
  2. When the first ractor is created, the objects in that list are recursively marked with FL_AUTOSHARE
    • after this point, constant assignments result directly in FL_AUTOSHARE
  3. In the main ractor, a call to rb_check_frozen (meaning the object is being modified) will
    1. if FL_AUTOSHARE is set (state A)
      • [with ractor lock]
        • unless object is shareable
          • unset FL_AUTOSHARE (state B)
    2. raise error if frozen
      • ideally with different message if object has FL_SHAREABLE
  4. When a non-main ractor accesses a non-shareable constant
    1. if object referenced by constant has FL_AUTOSHARE set (state A)
      • [with ractor lock]
        • if all objects recursively are still marked with FL_AUTOSHARE
          • make_shareable (state C)
        • else
          • unset top object's FL_AUTOSHARE (state B)
    2. raise error if not shareable
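
The steps above can be modelled in plain Ruby. This is only a toy sketch of the state transitions, not the proposed C implementation: the class AutoShareConstant, its method names, and NotShareableError are hypothetical. A Mutex stands in for the ractor lock, and the recursive FL_AUTOSHARE check is collapsed into one state per constant.

```ruby
# Hypothetical error class standing in for the "non-shareable" access error.
class NotShareableError < StandardError; end

# Toy model of one constant under the proposal (all names are assumptions).
class AutoShareConstant
  attr_reader :state

  def initialize(value)
    @value = value
    @state = :auto_shareable   # state A
    @lock  = Mutex.new         # stands in for the ractor lock
  end

  # Step 3: the main ractor is about to modify the object.
  def modify_from_main
    @lock.synchronize do
      @state = :non_shareable if @state == :auto_shareable   # A -> B
    end
    raise FrozenError, "can't modify frozen/shared object" if @state == :shareable
    yield @value
  end

  # Step 4: a non-main ractor reads the constant.
  def read_from_non_main
    @lock.synchronize do
      if @state == :auto_shareable
        Ractor.make_shareable(@value)   # deep-freeze
        @state = :shareable             # A -> C
      end
    end
    unless @state == :shareable
      raise NotShareableError, "can not access non-shareable objects in constant"
    end
    @value
  end
end

a = AutoShareConstant.new([1, [2, [3, 4]]])
a.read_from_non_main                          # A -> C, value is now deep-frozen

h = AutoShareConstant.new({ a: "a" })
h.modify_from_main { |hash| hash[:b] = "b" }  # A -> B, value stays mutable
```

After these calls, a further modify_from_main on a raises FrozenError, and a read_from_non_main on h raises NotShareableError.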

Result

So in the case that these 2 things happen in parallel:
1) main ractor modifies content of constant X
2) non-main ractor accesses constant X

There are 2 possible outcomes:
a) main ractor error "can't modify frozen/shared object"
b) non-main ractor error "can not access non-shareable objects in constant X"
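
As a rough illustration, the two outcomes correspond to the two possible orderings of the events. The helper below is hypothetical and only replays the state transitions sequentially; it does not involve real ractors.

```ruby
# Returns which side observes the error for a given ordering of the two events.
def outcome(order)
  state = :auto_shareable
  order.each do |event|
    case event
    when :main_modifies
      return :main_ractor_error if state == :shareable
      state = :non_shareable
    when :non_main_reads
      return :non_main_ractor_error if state == :non_shareable
      state = :shareable
    end
  end
  :no_error
end

read_first   = outcome([:non_main_reads, :main_modifies])  # :main_ractor_error
modify_first = outcome([:main_modifies, :non_main_reads])  # :non_main_ractor_error
```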

Benefits

In the normal case where non-frozen constants are left untouched after being assigned, this makes it possible to skip a lot of .freeze or Ractor.make_shareable or # shareable_constant_value: true boilerplate.

When you get the error "can not access non-sharable objects in constant X by non-main Ractor", you first have to make constant X shareable. That can then trigger a secondary error, that X is frozen, which you also have to debug. This proposal cuts the debugging in half by skipping directly to the FrozenError.
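
A small sketch of that secondary error, assuming Ruby 3.0 or later. The local variable x stands in for constant X, and the mutation is a made-up example of code elsewhere that still modifies it.

```ruby
# First fix: make the value shareable so non-main ractors may read it.
x = Ractor.make_shareable([1, [2, [3, 4]]])

# Secondary error: code that still mutates the value now fails.
error = nil
begin
  x[1] << 5
rescue FrozenError => e
  error = e
end
```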

Downsides

When you get the error "can not access non-sharable objects in constant X by non-main Ractor" you may want to solve the issue by e.g. copying the constant X rather than freezing it. This proposal makes it slightly harder to find where X is being accessed in the non-main ractor.

In the case of conflict, whether the error occurs in the main ractor or the non-main ractor can be non-deterministic.

Applicability

This probably applies as well to global variables, class variables, and class instance variables.

Updated by ko1 (Koichi Sasada) about 1 month ago

If a non-main ractor is accessing H with a non-mutating operation (such as .each), modifying H from the main ractor will cause a thread-safety issue.

Updated by Dan0042 (Daniel DeLorme) about 1 month ago

ko1 (Koichi Sasada) wrote in #note-1:

If a non-main ractor is accessing H with a non-mutating operation (such as .each), modifying H from the main ractor will cause a thread-safety issue.

If a non-main ractor is accessing H, it will cause H to become frozen, so modifying H by main ractor will be impossible.

Updated by ko1 (Koichi Sasada) about 1 month ago

Oh, I see. I missed the process:

[with ractor lock]
  if all objects recursively are still marked with FL_AUTOSHARE
    make_shareable (state C)

I understand that this proposal calls make_shareable lazily, only when it is needed.
I had never considered rb_check_frozen(). Good point.

Other points:

  • is it acceptable runtime overhead (memory + traversing)? Not so heavy, but not 0. I heard there is a development rule that constants should be frozen. And if it is true, we only need to provide a convenient way to freeze (=~ make sharable) objects.
  • it can delay bug detection if mutation does not occur frequently (it violates early bug detection)
  • I'm not sure all objects call rb_check_frozen() before mutating their state... but maybe make_shareable can't be applied to such objects, so no problem... maybe.

Updated by Dan0042 (Daniel DeLorme) about 1 month ago

ko1 (Koichi Sasada) wrote in #note-3:

  • is it acceptable runtime overhead (memory + traversing)? Not so heavy, but not 0.

The deep-freezing traversal has to occur at some point. Eagerly when the constant is assigned or lazily when it's accessed by a ractor; it's the same cost. But if it's lazy we ensure this cost is only incurred for the minimum possible number of objects, unlike the shareable_constant_value pragma. So there may be a little gain there. The FL_AUTOSHARE marking pass (#2) is indeed an extra cost, although unlike deep_freeze it doesn't need to call user-defined methods. So yeah, overhead would be not so heavy but not 0.
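
To make the eager vs. lazy comparison concrete, here is a rough sketch that counts how many objects get frozen. The helper deep_freeze_count and the sample data are hypothetical, and only arrays and hashes are traversed.

```ruby
# Freezes obj and everything reachable from it.
# Returns how many objects had to be frozen.
def deep_freeze_count(obj)
  return 0 if obj.frozen?   # Integers and Symbols are already frozen
  obj.freeze
  count = 1
  case obj
  when Array then obj.each { |e| count += deep_freeze_count(e) }
  when Hash  then obj.each { |k, v| count += deep_freeze_count(k) + deep_freeze_count(v) }
  end
  count
end

# Fresh, unfrozen stand-ins for three constants A, B and C.
def sample_constants
  { A: [1, [2, [3, 4]]], B: [[5], [6]], C: [7] }
end

# Eager: every constant is deep-frozen at assignment time.
eager = sample_constants.values.sum { |v| deep_freeze_count(v) }   # 7

# Lazy: only the constants actually read by a non-main ractor are frozen.
accessed = [:A]
lazy = sample_constants.slice(*accessed).values.sum { |v| deep_freeze_count(v) }   # 3
```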

I heard there is a development rule that constants should be frozen.

It's the first time I hear that, but yes it looks like rubocop has a rule like that. But not everyone uses rubocop. In my code I would consider that an anti-pattern; littering .freeze everywhere just feels too ugly to me.

  • it can delay bug detection if mutation does not occur frequently (it violates early bug detection)

Yes, but this is the "secondary error" effect I described above. If you detect the bug early and fix it by making the constant frozen, you still have a delayed bug waiting for you if mutation doesn't occur frequently. Unless you have the discipline to search your code for modifying operations on every constant that you make frozen.

  • I'm not sure all objects call rb_check_frozen() before mutating their state

I can't think of a case where it's ok to mutate the state of a frozen object (apart from Queue, and even then it feels weird). That would have to be a bug right?

I understand that this proposal calls make_shareable lazily, only when it is needed.

Exactly; since you understand, would you be ok with adding this to the Developers Meeting agenda? I would like to leave this decision to you, as the Ractor developer.

Updated by Eregon (Benoit Daloze) about 1 month ago

A performance concern is this will make the first access to any constant need the process-global lock (or a more fine-grained lock, but then it will increase footprint).
That's not negligible, especially when considering reuse of JIT'ed code, where the JIT'ed code would have to handle that special first access and have an extra check and branch for it.

Also, and maybe more clearly, it would require every single Ruby constant read in the main Ractor to check that FL_AUTOSHARE flag, whether or not Ractors are used.

Updated by Eregon (Benoit Daloze) about 1 month ago

Eregon (Benoit Daloze) wrote in #note-5:

Also, and maybe more clearly, it would require every single Ruby constant read in the main Ractor to check that FL_AUTOSHARE flag, whether or not Ractors are used.

Actually not this one, based on the design above, I misunderstood.

The first concern seems still valid, except it's on the first rb_check_frozen() and not the first access.

The list of all constants values until the first Ractor is created could be quite some footprint overhead, and it has to be weak of course.
Maybe starting in step 3 avoids the need for that list?

Updated by Dan0042 (Daniel DeLorme) about 1 month ago

Eregon (Benoit Daloze) wrote in #note-5:

A performance concern is this will make the first rb_check_frozen() to any constant need the process-global lock (or a more fine-grained lock, but then it will increase footprint).
That's not negligible, especially when considering reuse of JIT'ed code, where the JIT'ed code would have to handle that special first access and have an extra check and branch for it.

I would consider this cost negligible, because in normal code one does not modify constants, therefore rb_check_frozen() would only be called in the exceptional case that a constant is modified, and additionally the ractor lock would only occur on the first modification of such a constant. I can't speak about the performance implications regarding JIT'ed code, however it seems no worse to me than the case of trying to modify a frozen object.

The list of all constants values until the first Ractor is created could be quite some footprint overhead, and it has to be weak of course.

If the footprint overhead is too big that would be a problem, but it's quite hard to gauge how big it can realistically get. Searching for constant assignment in 582 popular gems and excluding irrelevant types such as Class, Integer and Symbol, I found 7012 constants. That would cost 55kB of footprint. Including an additional 195 aws-sdk-* gems I reach 39644 (!) constant assignments. But how many constant assignments can we find in a real-world app including dependencies?
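
The 55kB figure is consistent with one 8-byte pointer per list entry. That per-entry size is an assumption here (a 64-bit build with no per-entry overhead), and the helper name is made up for illustration.

```ruby
# Approximate size of the constant list in KiB.
# bytes_per_entry: 8 is an assumption (one pointer on a 64-bit build).
def list_footprint_kib(constant_count, bytes_per_entry: 8)
  (constant_count * bytes_per_entry / 1024.0).round(1)
end

popular_gems = list_footprint_kib(7012)    # 54.8
with_aws_sdk = list_footprint_kib(39644)   # 309.7
```

Under the same assumption, the 39644 constants found when including the aws-sdk-* gems would come to roughly 310kB.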

Maybe starting in step 3 avoids the need for that list?

I had two reasons for introducing that list

  1. If ractors are not used, we can avoid entirely the cost of marking all those constants with FL_AUTOSHARE. On the other hand that would mean the list is never released. That's not so great, but as with most performance issues it's all about the cpu/memory tradeoff. Maybe just limit the list length?
  2. Make it possible to modify a constant and still have it usable in ractors:

    A = [1,2] #without the list, this is marked FL_AUTOSHARE
    A << 3    #and so this modification causes it to become non-shared
    Ractor.new do
      p A     #so here it cannot be used
    end.take
    

Updated by matz (Yukihiro Matsumoto) about 1 month ago

  • Status changed from Open to Feedback

I understand the benefit of the proposal, but the mechanism behind it is too complex and can cause unexpected results with concurrency (e.g. race conditions). It's a code (design) smell. We need more time to investigate further.

Matz.

Updated by Dan0042 (Daniel DeLorme) about 1 month ago

The mechanism's implementation may be a little complex but I think usage is quite simple. In the vast majority of cases the developer doesn't need to think about anything; existing constants will just work with ractors. It's like GC vs manual memory management; GC is more complex but it's worth it, right?

As for race conditions, I think the only one possible is that if an error occurs, whether it occurs in the main ractor or the non-main ractor can be non-deterministic. I can understand that seems like a design smell, but imho it's a much worse smell if developers have to opt-in to Ractor compatibility by manually freezing their constants. That's going to make Ractor adoption much harder than it needs to be; simple gems that could be used in a Ractor will need to be patched simply because they use non-frozen constants.

Updated by ko1 (Koichi Sasada) about 1 month ago

I also +1 for Matz because of non-deterministic behavior.

Updated by Dan0042 (Daniel DeLorme) 13 days ago

In #17323#note-5, ko1 mentioned the possibility of providing a "fork" model. So I tried to think about whether it could apply here.

We can imagine that accessing an auto-shareable constant
a) from non-main ractor: is made shareable
b) from main ractor: is made shareable, and then a deep-dup copy is made and set aside for use by the main ractor only

In this case it's no longer non-deterministic, but there are other tradeoffs. Memory usage is double. Most importantly, the constant may have a diverging value in the main ractor. Let's say you have COUNTERS = Hash.new(0) and the counters are only incremented in the main ractor; but from the perspective of the non-main ractors the counters would always be zero. I would find this very unintuitive, and likely very hard to debug.
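
A rough simulation of that divergence in plain Ruby, without real ractors. The variable names are hypothetical. A plain dup is enough as a stand-in for the deep-dup here because the hash is flat.

```ruby
counters = Hash.new(0)              # stands in for COUNTERS

shared_view = counters.dup.freeze   # what non-main ractors would see
main_view   = counters.dup          # private copy set aside for the main ractor

# Only the main ractor increments the counters.
3.times { main_view[:requests] += 1 }

main_count   = main_view[:requests]     # 3
shared_count = shared_view[:requests]   # 0, the shared copy never changes
```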

I think the forking model could work for class variables because it's less surprising if a variable has a different value in different contexts.

In the end I remain convinced the original model I proposed is best. To a certain extent, non-deterministic behavior is a normal part of parallelism. For example in a producer/consumer architecture, if 2 producers each generate 1M 'A's and 1M 'B's, the consumer will see them in a non-deterministic order. No one would claim that's a problem. It's the same thing for this; the order may be non-deterministic (only if there's a race condition) but the end result is identical: an error.
