Feature #22097
Add Proc#with_refinements (open)
Description
Abstract
I propose Proc#with_refinements(mod, ...) to support block-level refinements.
module StringExt
  refine String do
    def shout = upcase + "!"
  end
end

original = ->(s) { s.shout }
refined = original.with_refinements(StringExt)
p refined.call("hello")  # "HELLO!"
p original.call("hello") # NoMethodError
When no argument is given, ArgumentError is raised.
When a non-Module argument is given, TypeError is raised.
Background and Motivation
I previously proposed Proc#using in [Feature #16461], but it introduced semantic complexities because it mutated existing blocks.
Instead of mutating the existing block, Proc#with_refinements returns a new Proc object with its own isolated call sites.
This makes its semantics much simpler than those of Proc#using, avoids thread-safety issues, and plays nicely with inline caches.
Use Cases
Proc#with_refinements is useful to implement internal DSLs applying refinements implicitly.
- activerecord-refinements
  - DSL for SQL
  - Example: User.where { :name == 'matz' }
- radd_djur
  - DSL for monadic parser combinators
Limitations
- Similar to Proc#binding, Proc#with_refinements raises ArgumentError if the receiver is not created from a Ruby block.
- Chained application of Proc#with_refinements is not allowed. ArgumentError is raised if the receiver is a Proc returned by Proc#with_refinements.
- define_method (and define_singleton_method) rejects a Proc with refinements. ArgumentError is raised if the return value of Proc#with_refinements is given to define_method.
Implementation
I've opened a pull request: https://github.com/ruby/ruby/pull/17248
A PoC for JRuby is also available at: https://github.com/jruby/jruby/pull/9486
Data structure changes
- Added a bit field has_refinements to rb_proc_t.
- Added a hidden instance variable to Proc to store a cref with the applied refinements.
- Added a single-entry cache refinement_memo to rb_iseq_constant_body.
Deep copy of iseq and caching
Proc#with_refinements performs a deep copy of the receiver's iseq to isolate its call sites from the original Proc.
While a deep copy can be an expensive operation, the single-entry cache in rb_iseq_constant_body mitigates this overhead effectively for most practical use cases where the same refinements are applied repeatedly.
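The effect of a single-entry cache can be sketched with a toy model in plain Ruby. This is not the CRuby implementation: SingleEntryMemo and its methods are hypothetical names, the key stands in for the list of refinement modules, and the block stands in for the deep copy.

```ruby
# Toy model of a single-entry memo (hypothetical; not the CRuby code).
class SingleEntryMemo
  attr_reader :copies

  def initialize
    @filled = false # whether the single slot holds anything yet
    @key = nil      # stands in for the module list of the cached copy
    @value = nil    # stands in for the cached iseq copy
    @copies = 0     # number of times the expensive copy was made
  end

  # Returns the cached value when the key matches. Otherwise runs the
  # block (the "deep copy") and overwrites the single slot.
  def fetch(key)
    return @value if @filled && @key == key

    @copies += 1
    @filled = true
    @key = key
    @value = yield
  end
end

# Same modules every time: only the first call pays for a copy.
same = SingleEntryMemo.new
3.times { same.fetch([:A]) { Object.new } }
same_key_copies = same.copies

# Alternating modules: every call evicts the other entry and copies again.
alternating = SingleEntryMemo.new
3.times do
  alternating.fetch([:A]) { Object.new }
  alternating.fetch([:B]) { Object.new }
end
alternating_copies = alternating.copies

p same_key_copies    # 1
p alternating_copies # 6
```

With one slot, repeated use of the same modules is a cache hit after the first call, while alternating between two module sets never hits.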
Overhead for code not using Proc#with_refinements
- Memory footprint: Neither internal structure grows in size. has_refinements is a 1-bit field added to rb_proc_t's existing bit field, and refinement_memo shares a union with mandatory_only_iseq in rb_iseq_constant_body.
- Execution speed: The common Proc#call path is kept frameless and only adds a single has_refinements bit check.
- GC: The mark/free/memsize functions add a single branch per iseq to select the union member.
Benchmark results: https://gist.github.com/shugo/ddfe92f28ea31e6527a2f270e6daee7c
Here's an excerpt from the results, where compare-ruby is master and built-ruby is the branch for this feature (focusing on Proc/Block operations):
| | compare-ruby | built-ruby |
|---|---|---|
| vm_proc | 47.215M | 46.149M |
| | 1.02x | - |
| vm_yield | 1.649 | 1.754 |
| | - | 1.06x |
Updated by shugo (Shugo Maeda) 26 days ago
- Related to Feature #16461: Proc#using added
- Related to Feature #12086: using: option for instance_eval etc. added
Updated by headius (Charles Nutter) 26 days ago
Thank you for considering JRuby! I will review your PR and also start a ruby-4.1 branch you can target.
Updated by shugo (Shugo Maeda) 25 days ago
- Description updated (diff)
Updated by shugo (Shugo Maeda) 25 days ago
headius (Charles Nutter) wrote in #note-2:
Thank you for considering JRuby! I will review your PR and also start a ruby-4.1 branch you can target.
Thank you!
I've opened a new pull request at: https://github.com/jruby/jruby/pull/9486
Updated by shugo (Shugo Maeda) 25 days ago
- Description updated (diff)
Updated by shugo (Shugo Maeda) 25 days ago
- Description updated (diff)
Updated by shugo (Shugo Maeda) 24 days ago
· Edited
For maintainability, I've replaced the hand-written iseq deep-copy with an in-memory IBF dump+load round-trip in: https://github.com/ruby/ruby/pull/17248/changes/21b977614f0e821192e9514bb76230db9c0292df
Updated by Eregon (Benoit Daloze) 24 days ago
· Edited
Since the performance relies on having with_refinements called always with the same Refinement module for a given block, how about raising an exception if it doesn't hold?
Then we effectively have a guarantee vs very slow performance for e.g. loop { original.with_refinements(A); original.with_refinements(B) } (silly example, but could happen naturally in a bigger app).
Semantically, nested blocks also get access to the refinements, as shown in test_with_refinements_nested_block, or for clarity:
module StringExt
  refine String do
    def shout = upcase + "!"
  end
end

original = ->(s) { -> { s.shout }.call }
refined = original.with_refinements(StringExt)
p refined.call("hello") # "HELLO!"
This is what I would expect, just I didn't see that in the description.
Copying a block's IR, and the IR of all nested blocks (IR = bytecode for CRuby), is quite expensive.
It's cached but it's still going to be a significant cost on either application startup/on the first request/etc.
It would be good to get some numbers on that, e.g. creating and calling N blocks vs the same but also using with_refinements.
The increased memory footprint would be worth documenting.
Semantically, this means a given block (the lexical construct) can behave significantly differently based on calls to the original Proc or the with_refinements Proc.
It's a bit like a given block being both a lambda and a proc, that's confusing and generally forbidden (except send(rand < 0.5 ? :lambda : :proc) { ... } but that's obvious; lambda(&b) is forbidden for this reason).
Or similar to the issues we had with Ractor.make_shareable (which we solved by making the semantics much more similar and error if it would be too different).
In summary: observably different semantics for the same block are always surprising, because they are hard to explain and to debug.
IOW, it can break the author of the block's intention, by changing what a given piece of Ruby code means.
I suppose the general expectation here is only the refined block is called and the original block is never called.
If that holds I think it's fine, the problem is how to make it hold?
To make the semantics cleaner, maybe we should prevent the original block from being called (i.e. raise an exception if it's called) once with_refinements has been called on it?
(note: this would be stored in the block, so for all Proc instances of that block)
One might still call the original block, then use with_refinements and observe the mixed semantics but that becomes a much narrower case.
One way to fully address that would be to make this lexical, like:
and error if proc_using_refinements is not called with a literal block.
Or maybe tweak the lambda operator like e.g.:
But I guess the use case here wants more flexibility?
Updated by shugo (Shugo Maeda) 23 days ago
· Edited
Thank you for the feedback!
Eregon (Benoit Daloze) wrote in #note-8:
Since the performance relies on having with_refinements called always with the same Refinement module for a given block, how about raising an exception if it doesn't hold?
I would prefer not to. The memo is just a cache, and this restriction would make the cache observable: whether prc.with_refinements(B) succeeds would depend on whether some other code called it with A before. For example, two libraries applying different refinements to Procs created from the same block would conflict, and the failure would depend on call order. That seems harder to debug than the performance issue it prevents.
Instead, how about emitting a performance warning (Warning[:performance], like the object shapes warnings) when with_refinements discards the cached copy because it was called with different modules for the same block? That makes the performance issue visible without changing the semantics.
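As a rough sketch of what such a warning could look like from the Ruby side: Kernel#warn accepts a category: keyword, and the :performance category exists in Ruby 3.3 and later. The helper name, its argument, and the message text below are hypothetical; the actual warning would be emitted from C inside with_refinements.

```ruby
# Hypothetical helper: the name and message text are illustrative only.
def warn_refinement_memo_discarded(label)
  message = "cached refined copy of #{label} was discarded: " \
            "with_refinements was called with different modules"
  begin
    # Printed only when Warning[:performance] is enabled (Ruby 3.3+).
    warn(message, category: :performance)
  rescue ArgumentError
    # Older Ruby has no :performance category; fall back to a plain warning.
    warn(message)
  end
end

# Opt in to performance warnings where the category is supported.
begin
  Warning[:performance] = true
rescue ArgumentError
  # Category not available on this Ruby version.
end

warn_refinement_memo_discarded("block at dsl.rb:10")
```

Because the category is off by default, code that never enables Warning[:performance] would see no output and no change in behavior.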
Semantically, nested blocks also get access to the refinements, as shown in test_with_refinements_nested_block, or for clarity:
Yes, nested blocks also see the refinements. This is intended behavior. The refinements also apply to methods defined with def inside the body. I have documented this in the RDoc in https://github.com/ruby/ruby/pull/17248/changes/bcca2d04f4c959850a43be5c44e2d272341e11d4
Copying a block's IR, and the IR of all nested blocks (IR = bytecode for CRuby), is quite expensive.
It's cached but it's still going to be a significant cost on either application startup/on the first request/etc.
It would be good to get some numbers on that, e.g. creating and calling N blocks vs the same but also using with_refinements.
That makes sense.
Here are some numbers, with 10,000 distinct blocks of a realistic size (about 15 lines, 2 nested blocks, so each copy duplicates 3 iseqs):
call 10000 original blocks                      27.2 ms  (2.72 us/block)
with_refinements x10000 (first time: copy)     261.8 ms (26.18 us/block)
call 10000 refined blocks                       27.6 ms  (2.76 us/block)
with_refinements x10000 (memoized)               3.2 ms  (0.32 us/block)
with_refinements x10000, alternating A/B       251.8 ms (25.18 us/block)
iseq tree size: original 4040 bytes, copy 3992 bytes
So the copy costs about 26 us and 4 KB per block per module set, and it happens only once thanks to the memoization. Even 10,000 refined blocks add only ~0.3 seconds to startup. Call speed is the same as the original.
The full script is at: https://gist.github.com/shugo/07e62c44bc4765ecff6d2b8e704b5f38
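The per-block figures follow directly from the totals reported above. A small worked calculation, using only the numbers from this comment (the variable names are just for illustration):

```ruby
blocks        = 10_000
first_copy_ms = 261.8 # with_refinements x10000 (first time: copy)
memoized_ms   = 3.2   # with_refinements x10000 (memoized)
copy_bytes    = 3992  # iseq tree size of one copy

# Milliseconds for all blocks -> microseconds per block.
copy_us_per_block     = (first_copy_ms * 1000 / blocks).round(2)
memoized_us_per_block = (memoized_ms * 1000 / blocks).round(2)

# One-time startup cost in seconds, and memory for all copies in MB.
startup_seconds = (first_copy_ms / 1000).round(2)
total_copy_mb   = (copy_bytes * blocks / 1_000_000.0).round(2)

p copy_us_per_block     # 26.18
p memoized_us_per_block # 0.32
p startup_seconds       # 0.26
p total_copy_mb         # 39.92
```

So 10,000 refined blocks of this size cost roughly a quarter of a second once, plus about 40 MB for the copied iseqs.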
The increased memory footprint would be worth documenting.
I have documented the memory footprint in the RDoc in https://github.com/ruby/ruby/pull/17248/changes/bcca2d04f4c959850a43be5c44e2d272341e11d4
To make the semantics cleaner, maybe we should prevent the original block from being called (i.e. raise an exception if it's called) once with_refinements has been called on it?
In the intended use cases, only the refined Proc is called. But I would prefer not to enforce it, for two reasons:
- Calling both is well-defined: each Proc behaves consistently, and their inline caches are isolated. There is no "mixed" state.
- Storing a flag in the block would mutate state shared by all Proc instances of that block. Calling original.call would suddenly raise because some other code called with_refinements on a sibling Proc. This is the same action-at-a-distance problem that Proc#using had, which this proposal was redesigned to avoid.
Note that a block can already behave differently depending on how it is invoked (instance_exec changes self, instance variables, and method resolution). with_refinements is similar: an explicit, opt-in re-binding of the resolution context, and the new Proc object makes the boundary visible.
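The instance_exec comparison can be checked with released Ruby. In this minimal sketch the class names Author and Library are hypothetical; the same block resolves name differently depending on how it is invoked.

```ruby
class Author
  def name = "author"

  # The block captures self (an Author) at creation time.
  def make_block = proc { name }
end

class Library
  def name = "library"
end

block = Author.new.make_block

called  = block.call                        # self is the Author instance
rebound = Library.new.instance_exec(&block) # self is the Library instance

p called  # "author"
p rebound # "library"
```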
One way to fully address that would be to make this lexical, like:
(snip)
But I guess the use case here wants more flexibility?
As you guessed, the main use case is the opposite direction: a library (e.g. a DSL) applies its own refinements to blocks written by its users, so that users do not need to write using or name the modules. A lexical form cannot express this, because the modules are chosen by the library that receives the block, not by the author of the block.
For the case where the author of the block names the modules, a convenience method close to your proc_using_refinements can be built on top of the primitive:
module Kernel
  private def using_refinements(*modules, &block)
    block.with_refinements(*modules).call
  end
end

using_refinements(StringExt) { "hi".shout } #=> "HI!"
I think adding such a helper is fine (the name is open to discussion). But I want Proc#with_refinements as the primitive, because the case where a library receives a block cannot be expressed lexically.
Updated by shugo (Shugo Maeda) 21 days ago
· Edited
- Description updated (diff)
Now Proc#with_refinements can be called in non-main Ractors:
https://github.com/ruby/ruby/pull/17248/changes/4db090a95939e92189bfcdfe096777950eb0b869
Memo access is synchronized with RB_VM_LOCKING().
In single-Ractor mode it takes no actual lock (it is gated on rb_multi_ractor_p()), so the overhead is negligible:
call 10000 original blocks                      23.5 ms  (2.35 us/block)
with_refinements x10000 (first time: copy)     260.5 ms (26.05 us/block)
call 10000 refined blocks                       23.5 ms  (2.35 us/block)
with_refinements x10000 (memoized)               3.1 ms  (0.31 us/block)
with_refinements x10000, alternating A/B       246.1 ms (24.61 us/block)
iseq tree size: original 4040 bytes, copy 3992 bytes
Updated by shugo (Shugo Maeda) 18 days ago
- Description updated (diff)
Updated by shugo (Shugo Maeda) 18 days ago
- Description updated (diff)
Updated by shugo (Shugo Maeda) 18 days ago
- Related to Feature #12281: Allow lexically scoped use of refinements with `using {}` block syntax added
Updated by shugo (Shugo Maeda) 17 days ago
- Description updated (diff)
Updated by shugo (Shugo Maeda) 17 days ago
- Description updated (diff)
Updated by Eregon (Benoit Daloze) 13 days ago
· Edited
Thank you for your replies, they address most of my concerns and show this was well thought out.
Yes, I think a performance warning is fine and important to add; without one, this would be a silent performance trap.
shugo (Shugo Maeda) wrote in #note-9:
The refinements also apply to methods defined with def inside the body
def is usually a lexical boundary but not for refinements (i.e., using), so I guess it makes some sense.
I'm not sure it's really needed though and might be surprising, is it needed?
I think for examples like
proc {
  Foo.class_eval do
    def bar
      "hi".shout
    end
  end
  Foo.new.bar
}.with_refinements(StringRefinement)
it's quite strange and seems safer to not support that.
This seems confusing, it looks like mixing refine and using together, even though it's only using and so defining a global method with some refinements applied.
module/class would still be boundaries, right?
Then the above would work but not this?
OTOH, I suppose Proc#with_refinements is basically doing the same as if the proc had using M on the first line, and the refinements wouldn't apply past the end of the proc.
But I think going through def or module/class is the action-at-a-distance problem again,
e.g. if the user defines some method inside a block (where it is not at all clear, at the place the block is declared, that it has refinements), then it unexpectedly gets refinements too.
I think we should limit it to nested blocks, those already inherit the local variables so it's natural.
define_method would also not use those refinements.
Updated by shugo (Shugo Maeda) 11 days ago
· Edited
Thank you for your feedback, and I'm glad to hear your concerns are mostly addressed.
Eregon (Benoit Daloze) wrote in #note-16:
I'm not sure it's really needed though and might be surprising, is it needed?
with_refinements is semantically equivalent to having using M at the beginning of the proc body. With using, refinements are active through class/module/def -- this is the existing, documented behavior (see doc/syntax/refinements.rdoc). Keeping with_refinements consistent with using is a deliberate design choice: users who understand using should be able to predict with_refinements behavior without learning new scoping rules.
I also don't think this is action-at-a-distance. The action-at-a-distance I described in #note-9 was about mutating shared state: calling with_refinements(A) would cause a separate call to original.call to raise, depending on call order. That is a non-local, unpredictable side effect. In contrast, refinements applying inside def within the proc body are deterministic and local -- the def is lexically inside the scope where refinements are active, so its behavior is fully determined by the lexical structure, not by runtime call order.
module/class would still be boundaries, right?
No, module/class are not boundaries for refinements either. With using, refinements apply through class/module/def:
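As a minimal sketch of that existing behavior (not the snippet from the original comment), the following uses Module#using inside a module body, since Proc#with_refinements is not available in released Ruby. The names UsingScopeDemo, Greeter, and Helpers are hypothetical.

```ruby
module StringExt
  refine String do
    def shout = upcase + "!"
  end
end

module UsingScopeDemo
  using StringExt # active until the end of this module body

  class Greeter
    # A def inside a nested class body still sees the refinement.
    def greet(name) = name.shout
  end

  module Helpers
    # So does a singleton method defined in a nested module body.
    def self.greet(name) = name.shout
  end
end

p UsingScopeDemo::Greeter.new.greet("hello") # "HELLO!"
p UsingScopeDemo::Helpers.greet("hello")     # "HELLO!"
p "hello".respond_to?(:shout)                # false
```

The nested class, module, and def are all lexically inside the scope where using is active, so none of them acts as a boundary.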
I have also fixed the JRuby implementation to apply refinements inside class/module bodies in: https://github.com/jruby/jruby/pull/9486/commits/f3b31e935241a44faa49c4a345f0625834018ef9
define_method would also not use those refinements.
Calling define_method inside a proc body with refinements is the same as calling it under using -- the block passed to define_method sees the refinements from the surrounding lexical scope:
The method defined this way retains the refinements after the proc returns, because they are captured in the block's iseq at definition time -- exactly as with using.
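The same capture behavior can be observed today with using. This is a sketch under that substitution, not the code from the original comment; DefineMethodDemo and Greeter are hypothetical names.

```ruby
module StringExt
  refine String do
    def shout = upcase + "!"
  end
end

class Greeter; end

module DefineMethodDemo
  using StringExt

  # The block is written where the refinement is active, so the method
  # body keeps seeing it after this module body has finished.
  Greeter.define_method(:greet) { |name| name.shout }
end

p Greeter.new.greet("hello") # "HELLO!"
```

The call happens outside DefineMethodDemo, where String#shout is otherwise not visible, yet the defined method still resolves it.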
Note that this is different from passing a with_refinements-applied Proc directly to define_method, which raises ArgumentError as documented in Limitations. The distinction is: calling define_method inside a refined proc body is normal using behavior; passing a refined Proc as the method body is not supported.
I think we should limit it to nested blocks
Restricting with_refinements to blocks only, excluding def/class/module, would make it inconsistent with using without a clear safety benefit -- the refined behavior is still deterministic and lexically scoped in all cases.
It would also hurt usability. For example, as mentioned in #note-9, a convenience helper can be built on top of with_refinements:
If refinements were limited to nested blocks, this natural pattern would not work.
I don't think we should introduce restrictions that using doesn't have.