Bug #18826
openSymbol#to_proc inconsistent, sometimes calls private methods
Description
The following usage calls a protected method and prints "hello":
class Test
protected
def referenced_columns
puts "hello"
end
end
Test.new.tap(&:referenced_columns)
However, the following usage results in a NoMethodError:
class Integer
private
def foo
42
end
end
(1..4).collect(&:foo)
It seems to be a bug that tap calls a private method. It is also inconsistent with collect not calling private methods.
Updated by jeremyevans0 (Jeremy Evans) 11 days ago
This appears to be caused by the use of the VM_FCALL
flag in vm_call_symbol
. Since :foo.to_proc
should be treated as lambda{|t| t.foo}
, not lambda{|t| t.send(:foo)}
, I agree that this is a bug.
Fixing this for private methods is not too difficult, by adding a flags
argument to vm_call_symbol
, and only using VM_CALL_FCALL
for the vm_call_opt_send
case and not the vm_invoke_symbol_block
case.
For protected methods, it's more challenging, since calling tap
pushes a new VM frame where cfp->self
is the receiver of tap
, and the method called by Symbol#to_proc
will always be called on that receiver, so protected methods would not raise exceptions. Hopefully @nobu (Nobuyoshi Nakada) can figure out how best to handle the protected method case.
With your second example, if you switch private
to protected
, you get a NoMethodError
, which makes sense, since you have an a range receiver (1..4
). calling an integer method (foo
). For cases where the receiver matches, both tap
and collect
appear to allow protected methods:
class Array
protected
def foo
first
end
end
a = [[1], [2], [3]]
p a.tap(&:foo)
p a.collect(&:foo)
So it appears to that :foo.to_proc
should be handled the same as lambda{|t| t.foo}
in regards to protected methods, unless we want to break backwards compatibility.
Here's a fix for the private method case:
diff --git a/vm_insnhelper.c b/vm_insnhelper.c
index 09fcd2f729..33daada3cf 100644
--- a/vm_insnhelper.c
+++ b/vm_insnhelper.c
@@ -3201,7 +3201,7 @@ ci_missing_reason(const struct rb_callinfo *ci)
static VALUE
vm_call_symbol(rb_execution_context_t *ec, rb_control_frame_t *reg_cfp,
- struct rb_calling_info *calling, const struct rb_callinfo *ci, VALUE symbol)
+ struct rb_calling_info *calling, const struct rb_callinfo *ci, VALUE symbol, int flags)
{
ASSUME(calling->argc >= 0);
/* Also assumes CALLER_SETUP_ARG is already done. */
@@ -3211,9 +3211,7 @@ vm_call_symbol(rb_execution_context_t *ec, rb_control_frame_t *reg_cfp,
VALUE recv = calling->recv;
VALUE klass = CLASS_OF(recv);
ID mid = rb_check_id(&symbol);
- int flags = VM_CALL_FCALL |
- VM_CALL_OPT_SEND |
- (calling->kw_splat ? VM_CALL_KW_SPLAT : 0);
+ flags |= VM_CALL_OPT_SEND | (calling->kw_splat ? VM_CALL_KW_SPLAT : 0);
if (UNLIKELY(! mid)) {
mid = idMethodMissing;
@@ -3300,7 +3298,7 @@ vm_call_opt_send(rb_execution_context_t *ec, rb_control_frame_t *reg_cfp, struct
calling->argc -= 1;
DEC_SP(1);
- return vm_call_symbol(ec, reg_cfp, calling, calling->ci, sym);
+ return vm_call_symbol(ec, reg_cfp, calling, calling->ci, sym, VM_CALL_FCALL);
}
}
@@ -4104,7 +4102,7 @@ vm_invoke_symbol_block(rb_execution_context_t *ec, rb_control_frame_t *reg_cfp,
VALUE symbol = VM_BH_TO_SYMBOL(block_handler);
CALLER_SETUP_ARG(reg_cfp, calling, ci);
calling->recv = TOPN(--calling->argc);
- return vm_call_symbol(ec, reg_cfp, calling, ci, symbol);
+ return vm_call_symbol(ec, reg_cfp, calling, ci, symbol, 0);
}
}
Updated by Eregon (Benoit Daloze) 11 days ago
I think Symbol#to_proc should behave like public_send
, i.e., only allows calling public methods.
That makes sense given we don't really have a call-site where we can assess the caller self easily, especially if converted to something like lambda{|t| t.foo}
(i.e., once, in one place, on the Symbol#to_proc
call, not on every Proc#call
of that Proc).
This is what TruffleRuby currently implements and it seems to work well and match user expectations (we've only seen one case in Rails which called a protected method, and that code has been replaced already).
So for the protected example:
class Array
protected def foo
first
end
end
a = [[1], [2], [3]]
p a.tap(&:foo) # IMHO should be the same as a.tap { |a2| a2.foo } (which is NoMethodError)
p a.collect(&:foo) # IMHO should be the same as a.collect { |o| o.foo } (which is NoMethodError)
Semantically, the self
of the Proc from Symbol#to_proc
, must be either the self
at the time Symbol#to_proc
is called, i.e., the main object here, or a dummy value like nil
or maybe the Symbol itself (to signify the Proc does not actually capture the outer scope).
But the :foo.to_proc.binding.receiver
shouldn't dynamically change when calling that Proc with Proc#call
(CRuby raises on :foo.to_proc.binding
but let's think conceptually behind that implementation detail).
The receiver should be fixed for Symbol#to_proc
procs, just like for every other Proc instance.
So I think the best and most sane (i.e., natural and does not depend on a specific implementation strategy for Symbol#to_proc
) semantics would be to only allow calling public methods with Symbol#to_proc
.
If that's deemed too incompatible on CRuby let's at least fix the issue that Symbol#to_proc
can call private methods sometimes.
Updated by Eregon (Benoit Daloze) 11 days ago
Also worth noting that for performance purposes, Symbol#to_proc
should be able to be cached globally, i.e., cache the Proc in the/per Symbol instance.
So then the self
inside that Proc should be a sentinel value (nil
or the Symbol itself).
Updated by jeremyevans0 (Jeremy Evans) 10 days ago
I agree with the @Eregon (Benoit Daloze) 's analysis, and submitted a pull request that makes Symbol#to_proc return a proc that will only call public methods: https://github.com/ruby/ruby/pull/6018 . I'm not happy with the implementation as it copies and modifies the code for vm_call_method
, but I don't know a better way to fix it.
This does cause a failure in rubyspec (which the pull request fixes), but I think that's due to the spec implementation (def at top-level defines private methods), and not by design. I'm guessing this will cause backwards compatibility issues, though hopefully they will be small, as found by TruffleRuby. I don't think it would be a good idea to backport the fix.
Updated by matz (Yukihiro Matsumoto) 9 days ago
In general, a.b(&:c)
should behave exactly the same as a.b{_1.c}
. But visibility check for protected
methods may be too difficult for to_proc
. So rejecting protected
methods altogether like private
methods is acceptable.
Matz.