Bug #19165
openMethod (with no param) delegation with *, **, and ... is slow
Description
I found that method delegation via Forwardable is much slower than normal method call when delegating a method that does not take parameters.
Here's a benchmark that explains what I mean.
require 'forwardable'
require 'pp'
require 'benchmark/ips'
class Obj
extend Forwardable
attr_accessor :other
def initialize
@other = Other.new
end
def foo_without_splat
@other.foo
end
def foo_with_splat(*)
@other.foo(*)
end
def foo_with_splat_with_name(*args)
@other.foo(*args)
end
def foo_with_splat_and_double_splat(*, **)
@other.foo(*, **)
end
def foo_with_triple_dots(...)
@other.foo(...)
end
delegate :foo => :@other
end
class Other
def foo() end
end
o = Obj.new
Benchmark.ips do |x|
x.report 'simple call' do
o.other.foo
end
x.report 'delegate without splat' do
o.foo_without_splat
end
x.report 'delegate with splat' do
o.foo_with_splat
end
x.report 'delegate with splat with name' do
o.foo_with_splat_with_name
end
x.report 'delegate with splat and double splat' do
o.foo_with_splat_and_double_splat
end
x.report 'delegate with triple dots' do
o.foo_with_triple_dots
end
x.report 'delegate via forwardable' do
o.foo
end
end
(result)
simple call 38.918M (± 0.9%) i/s - 194.884M
delegate without splat
31.933M (± 1.6%) i/s - 159.611M
delegate with splat 10.269M (± 1.6%) i/s - 51.631M
delegate with splat with name
9.888M (± 1.0%) i/s - 49.588M
delegate with splat and double splat
4.117M (± 0.9%) i/s - 20.696M
delegate with triple dots
4.169M (± 0.9%) i/s - 20.857M
delegate via forwardable
9.204M (± 2.1%) i/s - 46.295M
It shows that Method delegation with a splat is 3-4 times slower (regardless of whether the parameter is named or not), and delegation with a triple-dot literal is 9-10 times slower than a method delegation without an argument.
This may be because calling a method taking a splat always assigns an Array object even when no actual argument was given, and calling a method taking triple-dots assigns five Array objects and two Hash objects (this is equivalent to *, **
).
Are there any chance reducing these object assignments and making them faster? My concern is that the Rails framework heavily uses this kind of method delegations, and presumably it causes unignorable performance overhead.
Updated by matsuda (Akira Matsuda) about 2 years ago
FYI for confirming "five Array objects and two Hash objects" that I wrote above, I used ko1's allocation_tracer as follows:
require 'allocation_tracer'
ObjectSpace::AllocationTracer.setup([:type])
o.foo_with_triple_dots
pp ObjectSpace::AllocationTracer.trace {
o.foo_with_triple_dots
}
Updated by Eregon (Benoit Daloze) about 2 years ago
How many allocations is it with ...
when not using forwardable but just delegation with (...)
? I'd think 1 Array + 1 Hash.
Updated by shugo (Shugo Maeda) about 2 years ago
It seems that ...
is faster without [Feature #19134]:
simple call 13.250M (± 2.0%) i/s - 66.792M in 5.043180s
delegate without splat
12.523M (± 1.3%) i/s - 62.863M in 5.020866s
delegate with splat 6.231M (± 1.8%) i/s - 31.452M in 5.049532s
delegate with splat with name
6.152M (± 3.3%) i/s - 30.958M in 5.038120s
delegate with splat and double splat
2.187M (± 2.0%) i/s - 10.981M in 5.024101s
delegate with triple dots
5.976M (± 1.6%) i/s - 30.120M in 5.041456s
delegate via forwardable
5.072M (± 1.4%) i/s - 25.818M in 5.091690s
args = arg_append(p, args, new_hash(p, kwrest, loc), loc);
in the following code seems to be slow.
static NODE *
new_args_forward_call(struct parser_params *p, NODE *leading, const YYLTYPE *loc, const YYLTYPE *argsloc)
{
NODE *rest = NEW_LVAR(idFWD_REST, loc);
NODE *kwrest = list_append(p, NEW_LIST(0, loc), NEW_LVAR(idFWD_KWREST, loc));
NODE *block = NEW_BLOCK_PASS(NEW_LVAR(idFWD_BLOCK, loc), loc);
NODE *args = leading ? rest_arg_append(p, leading, rest, loc) : NEW_SPLAT(rest, loc);
args = arg_append(p, args, new_hash(p, kwrest, loc), loc);
return arg_blk_pass(args, block);
}
Should we revert [Feature #19134]?
Updated by Eregon (Benoit Daloze) about 2 years ago
- Related to Feature #19134: ** is not allowed in def foo(...) added
Updated by ko1 (Koichi Sasada) about 2 years ago
I made a patch https://github.com/ruby/ruby/pull/6920
This patch improves the performance of the cases which are discussed on this ticket.
However, this patch changes the calling convention and YJIT and MJIT are needed to catch up.
I'm not sure we can do it in some days.
Updated by tenderlovemaking (Aaron Patterson) 4 days ago
I reran this benchmark with Ruby 3.5. I think most numbers have improved:
$ ruby test.rb
ruby 3.5.0dev (2025-01-16T16:20:06Z master d05f6a9b8f) +PRISM [arm64-darwin24]
Warming up --------------------------------------
simple call 3.459M i/100ms
delegate without splat
2.703M i/100ms
delegate with splat 1.343M i/100ms
delegate with splat with name
1.347M i/100ms
delegate with splat and double splat
1.301M i/100ms
delegate with triple dots
2.109M i/100ms
delegate via forwardable
1.125M i/100ms
Calculating -------------------------------------
simple call 34.395M (± 0.4%) i/s (29.07 ns/i) - 172.947M in 5.028381s
delegate without splat
27.450M (± 1.5%) i/s (36.43 ns/i) - 137.830M in 5.022175s
delegate with splat 13.360M (± 0.5%) i/s (74.85 ns/i) - 67.155M in 5.026767s
delegate with splat with name
13.662M (± 0.3%) i/s (73.20 ns/i) - 68.678M in 5.027114s
delegate with splat and double splat
13.717M (± 0.5%) i/s (72.90 ns/i) - 68.952M in 5.026767s
delegate with triple dots
20.958M (± 0.9%) i/s (47.71 ns/i) - 105.464M in 5.032448s
delegate via forwardable
11.273M (± 0.7%) i/s (88.71 ns/i) - 57.368M in 5.089149s
Comparison:
simple call: 34394553.5 i/s
delegate without splat: 27450367.6 i/s - 1.25x slower
delegate with triple dots: 20958426.8 i/s - 1.64x slower
delegate with splat and double splat: 13717235.5 i/s - 2.51x slower
delegate with splat with name: 13661705.2 i/s - 2.52x slower
delegate with splat: 13359936.0 i/s - 2.57x slower
delegate via forwardable: 11273246.8 i/s - 3.05x slower
I think we could probably do something to speed up anonymous *
and anonymous **
, but I'm not sure why use those and not ...
. I guess there is a reason, but I also bet most cases in Rails that use anonymous *
and **
can be changed to use ...
without any impact to behavior.
Updated by jeremyevans0 (Jeremy Evans) 4 days ago
tenderlovemaking (Aaron Patterson) wrote in #note-6:
I think we could probably do something to speed up anonymous
*
and anonymous**
, but I'm not sure why use those and not...
. I guess there is a reason, but I also bet most cases in Rails that use anonymous*
and**
can be changed to use...
without any impact to behavior.
tldr; ...
is useful if you are forward everything, *
, **
, and &
are useful if you want to be selective regarding what you are forwarding.
You can only use ...
if you are forwarding everything other than the preceding positional arguments. There are cases where you need to use *
and **
and cannot use ...
(though they may be less common than cases where you can use ...
):
def a(*, **nil); b(*) end # do not accept keyword arguments
def a(*, c: true) b(*) if c end # only accept specific keyword arguments
def a(**); b(**) end # do not accept positional arguments
def a(c, **); b(**) if c end # only accept specific positional argument
def a(*, **); b(*, **) if yield end # do not forward block
Updated by tenderlovemaking (Aaron Patterson) 4 days ago
jeremyevans0 (Jeremy Evans) wrote in #note-7:
tenderlovemaking (Aaron Patterson) wrote in #note-6:
I think we could probably do something to speed up anonymous
*
and anonymous**
, but I'm not sure why use those and not...
. I guess there is a reason, but I also bet most cases in Rails that use anonymous*
and**
can be changed to use...
without any impact to behavior.tldr;
...
is useful if you are forward everything,*
,**
, and&
are useful if you want to be selective regarding what you are forwarding.You can only use
...
if you are forwarding everything other than the preceding positional arguments. There are cases where you need to use*
and**
and cannot use...
(though they may be less common than cases where you can use...
):def a(*, **nil); b(*) end # do not accept keyword arguments def a(*, c: true) b(*) if c end # only accept specific keyword arguments def a(**); b(**) end # do not accept positional arguments def a(c, **); b(**) if c end # only accept specific positional argument def a(*, **); b(*, **) if yield end # do not forward block
Yes. I think we could probably do a similar trick for anonymous * and ** that we do for ...
, I'm just unsure it's worth the complexity given the comparative popularity 🤷🏻♀️
Updated by jeremyevans0 (Jeremy Evans) 4 days ago
tenderlovemaking (Aaron Patterson) wrote in #note-8:
Yes. I think we could probably do a similar trick for anonymous * and ** that we do for
...
, I'm just unsure it's worth the complexity given the comparative popularity 🤷🏻♀️
I agree. Especially when you consider that **
does not allocate if passed a keyword splat or no keywords, and *
does not allocate if passed a positional argument splat.
Given:
def a(*, **); b(*, **) end
array = []
kw = {}
The generic argument forwarding optimization should speed up the following by reducing allocations (comment shows the allocations saved):
a # 1 array
a(1) # 1 array
a(*array, b: 1) # 1 hash
a(b: 1) # 1 array and 1 hash
The following calls would likely not receive a speedup from the generic argument forwarding optimization, since they would not save an allocation (both of these calls are already allocationless):
a(*array)
a(*array, **kw)