Feature #22097: Add Proc#with_refinements

Added by shugo (Shugo Maeda) 5 days ago. Updated about 12 hours ago.

Status: Open
Assignee: -
Target version: -
[ruby-core:125650]

Description

Abstract

I propose Proc#with_refinements(mod, ...) to support block-level refinements.

module StringExt
  refine String do
    def shout = upcase + "!"
  end
end

original = ->(s) { s.shout }
refined = original.with_refinements(StringExt)
p refined.call("hello")  # "HELLO!"
p original.call("hello") # NoMethodError

When no argument is given, ArgumentError is raised.
When a non-Module argument is given, TypeError is raised.
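
For contrast, here is a minimal sketch of how the same refinement behaves in Ruby today, where activation is purely lexical via using. The names LexicalDemo, INSIDE, OUTSIDE and call_or_error are illustrative and not part of the proposal. A block written inside the scope of using sees shout, while a block written elsewhere does not, and nothing outside the block can currently change that.

```ruby
module StringExt
  refine String do
    def shout
      upcase + "!"
    end
  end
end

module LexicalDemo
  using StringExt

  # Written inside the scope of `using`, so `shout` resolves via the refinement.
  INSIDE = ->(s) { s.shout }
end

# Written outside any `using`, so the refinement is invisible here.
OUTSIDE = ->(s) { s.shout }

# Helper for the demo: returns the result, or the error class on failure.
def call_or_error(prc, arg)
  prc.call(arg)
rescue NoMethodError => e
  e.class
end

p call_or_error(LexicalDemo::INSIDE, "hello")  # "HELLO!"
p call_or_error(OUTSIDE, "hello")              # NoMethodError
```

Proc#with_refinements, as proposed, would let the second block be turned into a refined copy without editing the file it was written in.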

Background and Motivation

I previously proposed Proc#using in [Feature #16461], but it introduced semantic complexities because it mutated existing blocks.

Instead of mutating the existing block, Proc#with_refinements returns a new Proc object with its own isolated call sites.
This makes the semantics much simpler than those of Proc#using, avoids thread-safety issues, and plays nicely with inline caches.

Limitations

  • Similar to Proc#binding, Proc#with_refinements raises ArgumentError if the
    receiver is not created from a Ruby block.

    :to_s.to_proc.with_refinements(StringExt) #=> ArgumentError

  • Chained application of Proc#with_refinements is not allowed. ArgumentError is
    raised if the receiver is a Proc returned by Proc#with_refinements.

    refined = prc.with_refinements(StringExt)
    refined.with_refinements(IntegerExt) #=> ArgumentError

  • define_method (and define_singleton_method) rejects a Proc with refinements.
    ArgumentError is raised if the return value of Proc#with_refinements is given to
    define_method.

    refined = prc.with_refinements(StringExt)
    define_method(:foo, &refined) #=> ArgumentError
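
The Proc#binding behavior that the first limitation is modeled on can be observed in current CRuby. The sketch below only uses existing methods. The constant and helper names are illustrative.

```ruby
# A Proc created from a Ruby block carries a binding.
FROM_BLOCK = proc { |x| x.to_s }

# A Proc created by Symbol#to_proc has no Ruby block behind it.
FROM_SYMBOL = :to_s.to_proc

# Helper for the demo: returns the Binding, or the error class on failure.
def binding_or_error(prc)
  prc.binding
rescue ArgumentError => e
  e.class
end

p binding_or_error(FROM_BLOCK).class  # Binding
p binding_or_error(FROM_SYMBOL)       # ArgumentError
```

Per the description above, Proc#with_refinements is expected to reject the second kind of Proc in the same way.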

Implementation

I've opened a pull request: https://github.com/ruby/ruby/pull/17248

A PoC for JRuby is also available at: https://github.com/jruby/jruby/pull/9486

Data structure changes

  • Added a bit field has_refinements to rb_proc_t.
  • Added a hidden instance variable to Proc to store a cref with the applied refinements.
  • Added a single-entry cache refinement_memo to rb_iseq_constant_body.

Deep copy of iseq and caching

Proc#with_refinements performs a deep copy of the receiver's iseq to isolate its call sites from the original Proc.
While a deep copy can be an expensive operation, the single-entry cache in rb_iseq_constant_body mitigates this overhead effectively for most practical use cases where the same refinements are applied repeatedly.
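
To make the caching behavior concrete, here is a rough Ruby model of a single-entry cache. It is only a sketch: the real memo lives in C inside rb_iseq_constant_body, and the class name SingleEntryMemo, the use of the module list as the key, and the placeholder modules ExtA and ExtB are assumptions made for illustration.

```ruby
# Hypothetical model of a single-entry cache; not the actual implementation.
class SingleEntryMemo
  attr_reader :copies

  def initialize
    @key = nil
    @value = nil
    @copies = 0
  end

  # Returns the cached value when `key` matches the stored key.
  # Otherwise runs the (expensive) block and replaces the single entry.
  def fetch(key)
    return @value if @copies > 0 && @key == key

    @copies += 1
    @key = key
    @value = yield
  end
end

module ExtA; end
module ExtB; end

same = SingleEntryMemo.new
100.times { same.fetch([ExtA]) { :deep_copy } }
p same.copies  # 1

alternating = SingleEntryMemo.new
100.times do
  alternating.fetch([ExtA]) { :deep_copy }
  alternating.fetch([ExtB]) { :deep_copy }
end
p alternating.copies  # 200
```

With one entry, repeated use of the same modules pays for a single copy, while alternating between two module sets evicts the entry on every call.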

Overhead for code not using Proc#with_refinements

  • Memory footprint: Neither internal structure grows in size. has_refinements is a 1-bit field added to rb_proc_t's existing bit field, and refinement_memo shares a union with mandatory_only_iseq in rb_iseq_constant_body.
  • Execution speed: The common Proc#call path is kept frameless and only adds a single has_refinements bit check.
  • GC: The mark/free/memsize functions add a single branch per iseq to select the union member.

Benchmark results: https://gist.github.com/shugo/ddfe92f28ea31e6527a2f270e6daee7c

Here's an excerpt from the results, where compare-ruby is master and built-ruby is the branch for this feature (focusing on Proc/Block operations):

            compare-ruby  built-ruby
vm_proc          47.215M     46.149M
                   1.02x           -
vm_yield           1.649       1.754
                       -       1.06x
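
The ratio rows in the excerpt can be reproduced from the two measurements in each benchmark. This sketch assumes that both columns report throughput (higher is better) and that the ratio is the faster build divided by the slower one. The constant and method names are illustrative.

```ruby
# Values copied from the excerpt above.
RESULTS = {
  "vm_proc"  => { "compare-ruby" => 47.215, "built-ruby" => 46.149 },
  "vm_yield" => { "compare-ruby" => 1.649,  "built-ruby" => 1.754 },
}

# Ratio of the larger value to the smaller one, rounded like the excerpt.
def speed_ratio(row)
  (row.values.max / row.values.min).round(2)
end

RESULTS.each do |name, row|
  faster = row.max_by { |_, value| value }.first
  puts "#{name}: #{faster} is #{speed_ratio(row)}x"
end
```

Under that assumption, the branch is about 2% behind master on vm_proc and about 6% ahead on vm_yield.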

Related issues 2 (2 open, 0 closed)

Related to Ruby - Feature #16461: Proc#using (Assigned; assignee: matz (Yukihiro Matsumoto))
Related to Ruby - Feature #12086: using: option for instance_eval etc. (Open)

Updated by shugo (Shugo Maeda) 5 days ago #1

Updated by headius (Charles Nutter) 5 days ago #2 [ruby-core:125651]

Thank you for considering JRuby! I will review your PR and also start a ruby-4.1 branch you can target.

Updated by shugo (Shugo Maeda) 5 days ago #3

  • Description updated (diff)

Updated by shugo (Shugo Maeda) 5 days ago #4 [ruby-core:125658]

headius (Charles Nutter) wrote in #note-2:

> Thank you for considering JRuby! I will review your PR and also start a ruby-4.1 branch you can target.

Thank you!
I've opened a new pull request at: https://github.com/jruby/jruby/pull/9486

Updated by shugo (Shugo Maeda) 5 days ago #5

  • Description updated (diff)

Updated by shugo (Shugo Maeda) 5 days ago #6

  • Description updated (diff)

Updated by shugo (Shugo Maeda) 4 days ago #7 [ruby-core:125699]

For maintainability, I've replaced the hand-written iseq deep-copy with an in-memory IBF dump+load round-trip in: https://github.com/ruby/ruby/pull/17248/changes/f27cf1d98c18f4137ace0243ca696ba3e17834af

Updated by Eregon (Benoit Daloze) 3 days ago · Edited #8 [ruby-core:125709]

Since the performance relies on with_refinements always being called with the same refinement module for a given block, how about raising an exception if that doesn't hold?
That would give us an effective guarantee instead of very slow performance in cases like loop { original.with_refinements(A); original.with_refinements(B) } (a silly example, but it could happen naturally in a bigger app).

Semantically, nested blocks also get access to the refinements, as shown in test_with_refinements_nested_block, or for clarity:

module StringExt
  refine String do
    def shout = upcase + "!"
  end
end

original = ->(s) { -> { s.shout }.call }
refined = original.with_refinements(StringExt)
p refined.call("hello")  # "HELLO!"

This is what I would expect; I just didn't see it in the description.

Copying a block's IR, and the IR of all its nested blocks (IR = bytecode for CRuby), is quite expensive.
It's cached, but it's still going to be a significant cost at application startup, on the first request, etc.
It would be good to get some numbers on that, e.g. creating and calling N blocks vs. the same but also using with_refinements.

The increased memory footprint would be worth documenting.

Semantically, this means a given block (the lexical construct) can behave significantly differently depending on whether the original Proc or the with_refinements Proc is called.
It's a bit like a given block being both a lambda and a proc, which is confusing and generally forbidden (except send(rand < 0.5 ? :lambda : :proc) { ... }, but that's obvious; lambda(&b) is forbidden for this reason).
It is also similar to the issues we had with Ractor.make_shareable (which we solved by making the semantics much more similar and raising an error if they would be too different).
In summary: observably different semantics for the same block are always surprising, because they are hard to explain and to debug.
IOW, it can break the intention of the block's author by changing what a given piece of Ruby code means.
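
The lambda(&b) restriction mentioned above can be checked on an existing Ruby. In this sketch the helper names are illustrative, and the outcome depends on the Ruby version: recent versions raise ArgumentError, while older ones return the Proc unchanged. In neither case does the existing block become a lambda.

```ruby
PLAIN_PROC = proc { |x| x }

# Try to turn an existing non-lambda Proc into a lambda.
# Returns the resulting Proc, or the ArgumentError if Ruby refuses.
def lambda_from(prc)
  lambda(&prc)
rescue ArgumentError => e
  e
end

# Summarizes which of the two version-dependent outcomes occurred.
def describe(result)
  if result.is_a?(Proc)
    "returned a Proc, lambda? = #{result.lambda?}"
  else
    "raised #{result.class}"
  end
end

puts describe(lambda_from(PLAIN_PROC))
```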

I suppose the general expectation here is that only the refined block is called and the original block is never called.
If that holds I think it's fine; the problem is how to make it hold.
To make the semantics cleaner, maybe we should prevent the original block from being called (i.e. raise an exception if it's called) once with_refinements has been called on it?
(note: this would be stored in the block, so it would apply to all Proc instances of that block)
One might still call the original block, then use with_refinements and observe the mixed semantics, but that becomes a much narrower case.

One way to fully address that would be to make this lexical, like:

proc_using_refinements(A) do
  ...
end

and raise an error if proc_using_refinements is not called with a literal block.
Or maybe tweak the lambda operator, e.g.:

->(s) [StringExt] { s.shout }

But I guess the use case here wants more flexibility?

Updated by shugo (Shugo Maeda) 3 days ago #9 [ruby-core:125719]

Thank you for the feedback!

Eregon (Benoit Daloze) wrote in #note-8:

> Since the performance relies on with_refinements always being called with the same refinement module for a given block, how about raising an exception if that doesn't hold?

I would prefer not to. The memo is just a cache, and this restriction would make the cache observable: whether prc.with_refinements(B) succeeds would depend on whether some other code called it with A before. For example, two libraries applying different refinements to Procs created from the same block would conflict, and the failure would depend on call order. That seems harder to debug than the performance issue it prevents.

Instead, how about emitting a performance warning (Warning[:performance], like the object shapes warnings) when with_refinements discards the cached copy because it was called with different modules for the same block? That makes the performance issue visible without changing the semantics.

> Semantically, nested blocks also get access to the refinements, as shown in test_with_refinements_nested_block, or for clarity:

Yes, nested blocks also see the refinements. This is intended behavior. The refinements also apply to methods defined with def inside the body. I have documented this in the RDoc in https://github.com/ruby/ruby/pull/17248/changes/5c84f091bb3f01a646554b368c62b197c3d6c700

> Copying a block's IR, and the IR of all its nested blocks (IR = bytecode for CRuby), is quite expensive.
> It's cached, but it's still going to be a significant cost at application startup, on the first request, etc.
> It would be good to get some numbers on that, e.g. creating and calling N blocks vs. the same but also using with_refinements.

That makes sense.
Here are some numbers, with 10,000 distinct blocks of a realistic size (about 15 lines, 2 nested blocks, so each copy duplicates 3 iseqs):

call 10000 original blocks                             27.2 ms  (2.72 us/block)
with_refinements x10000 (first time: copy)            261.8 ms  (26.18 us/block)
call 10000 refined blocks                              27.6 ms  (2.76 us/block)
with_refinements x10000 (memoized)                      3.2 ms  (0.32 us/block)
with_refinements x10000, alternating A/B              251.8 ms  (25.18 us/block)
iseq tree size: original 4040 bytes, copy 3992 bytes

So the copy costs about 26 us and 4 KB per block per module set, and it happens only once thanks to the memoization. Even 10,000 refined blocks add only ~0.3 seconds to startup. Call speed is the same as the original.
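
The per-block figures follow directly from the totals in the table above. The sketch below redoes that arithmetic. The aggregate memory figure assumes that all 10,000 copies stay alive at once, and the constant and method names are illustrative.

```ruby
BLOCKS = 10_000

# Totals in milliseconds, copied from the table above.
TOTAL_MS = {
  first_copy: 261.8,
  memoized:   3.2,
}

# From the "iseq tree size" row: size of one copied iseq tree in bytes.
COPY_BYTES = 3992

# Converts a total in milliseconds into microseconds per block.
def per_block_us(total_ms, blocks = BLOCKS)
  (total_ms * 1000.0 / blocks).round(2)
end

p per_block_us(TOTAL_MS[:first_copy])  # 26.18
p per_block_us(TOTAL_MS[:memoized])    # 0.32
p COPY_BYTES * BLOCKS                  # 39920000 bytes, roughly 40 MB in total
```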

The full script is at: https://gist.github.com/shugo/07e62c44bc4765ecff6d2b8e704b5f38

> The increased memory footprint would be worth documenting.

I have documented the memory footprint in the RDoc in https://github.com/ruby/ruby/pull/17248/changes/5c84f091bb3f01a646554b368c62b197c3d6c700

> To make the semantics cleaner, maybe we should prevent the original block from being called (i.e. raise an exception if it's called) once with_refinements has been called on it?

In the intended use cases, only the refined Proc is called. But I would prefer not to enforce it, for two reasons:

  • Calling both is well-defined: each Proc behaves consistently, and their inline caches are isolated. There is no "mixed" state.
  • Storing a flag in the block would mutate state shared by all Proc instances of that block. Calling original.call would suddenly raise because some other code called with_refinements on a sibling Proc. This is the same action-at-a-distance problem that Proc#using had, which this proposal was redesigned to avoid.

Note that a block can already behave differently depending on how it is invoked (instance_exec changes self, instance variables, and method resolution). with_refinements is similar: an explicit, opt-in re-binding of the resolution context, and the new Proc object makes the boundary visible.
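
The instance_exec comparison can be seen with existing Ruby alone. In this small sketch (the constant name is illustrative), one block is written once and gives different results depending on the receiver it is executed against.

```ruby
# One block, written once. The bare `size` call is sent to whatever `self` is.
SIZE_BLOCK = proc { size }

# instance_exec swaps `self`, so the same call resolves to different methods.
p "hello".instance_exec(&SIZE_BLOCK)    # 5
p [1, 2, 3].instance_exec(&SIZE_BLOCK)  # 3
```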

> One way to fully address that would be to make this lexical, like:
> (snip)
> But I guess the use case here wants more flexibility?

As you guessed, the main use case is the opposite direction: a library (e.g. a DSL) applies its own refinements to blocks written by its users, so that users do not need to write using or name the modules. A lexical form cannot express this, because the modules are chosen by the library that receives the block, not by the author of the block.
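
A minimal sketch of that library-side use case follows. TinyMarkup, its bold refinement and try_render are hypothetical names invented for illustration. Because Proc#with_refinements is only proposed, the sketch checks for the method first, so it also runs on a Ruby without it, where the user's block simply fails to see the refinement.

```ruby
# Hypothetical DSL library; none of these names come from the proposal.
module TinyMarkup
  module StringExt
    refine String do
      def bold
        "<b>#{self}</b>"
      end
    end
  end

  # Applies the library's refinements to the caller's block when the proposed
  # Proc#with_refinements exists; otherwise calls the block unchanged.
  def self.render(&block)
    block = block.with_refinements(StringExt) if block.respond_to?(:with_refinements)
    block.call
  end
end

# User code: no `using`, and no mention of TinyMarkup::StringExt.
def try_render
  TinyMarkup.render { "hi".bold }
rescue NoMethodError
  :refinement_not_applied
end

p try_render
```

The user's block never names the refinement module, which is the part a lexical form could not express.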

For the case where the author of the block names the modules, a convenience method close to your proc_using_refinements can be built on top of the primitive:

module Kernel
  private def using_refinements(*modules, &block)
    block.with_refinements(*modules).call
  end
end

using_refinements(StringExt) { "hi".shout }  #=> "HI!"

I think adding such a helper is fine (the name is open to discussion). But I want Proc#with_refinements as the primitive, because the case where a library receives a block cannot be expressed lexically.

Updated by shugo (Shugo Maeda) about 12 hours ago #10 [ruby-core:125760]

  • Description updated (diff)

Now Proc#with_refinements can be called in non-main Ractors:
https://github.com/ruby/ruby/pull/17248/changes/2cddb3fd99a7c68b4ab478f497f2610206edf335

Memo access is synchronized with RB_VM_LOCKING().
In single-Ractor mode it takes no actual lock (it is gated on rb_multi_ractor_p()), so the overhead is negligible:

call 10000 original blocks                             23.5 ms  (2.35 us/block)
with_refinements x10000 (first time: copy)            260.5 ms  (26.05 us/block)
call 10000 refined blocks                              23.5 ms  (2.35 us/block)
with_refinements x10000 (memoized)                      3.1 ms  (0.31 us/block)
with_refinements x10000, alternating A/B              246.1 ms  (24.61 us/block)
iseq tree size: original 4040 bytes, copy 3992 bytes