Project

General

Profile

Actions

Bug #18826

closed

Symbol#to_proc inconsistent, sometimes calls private methods

Added by bjfish (Brandon Fish) 8 months ago. Updated 6 months ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:108882]

Description

The following usage calls a protected method and prints "hello":

class Test
    protected

    def referenced_columns
        puts "hello"
    end
end      
Test.new.tap(&:referenced_columns)

However, the following usage results in a NoMethodError:

class Integer
    private
    def foo
      42
    end
  end

(1..4).collect(&:foo)

It seems to be a bug that tap calls a private method. It is also inconsistent with collect not calling private methods.

Updated by jeremyevans0 (Jeremy Evans) 8 months ago

This appears to be caused by the use of the VM_FCALL flag in vm_call_symbol. Since :foo.to_proc should be treated as lambda{|t| t.foo}, not lambda{|t| t.send(:foo)}, I agree that this is a bug.

Fixing this for private methods is not too difficult, by adding a flags argument to vm_call_symbol, and only using VM_CALL_FCALL for the vm_call_opt_send case and not the vm_invoke_symbol_block case.

For protected methods, it's more challenging, since calling tap pushes a new VM frame where cfp->self is the receiver of tap, and the method called by Symbol#to_proc will always be called on that receiver, so protected methods would not raise exceptions. Hopefully @nobu (Nobuyoshi Nakada) can figure out how best to handle the protected method case.

With your second example, if you switch private to protected, you get a NoMethodError, which makes sense, since you have an a range receiver (1..4). calling an integer method (foo). For cases where the receiver matches, both tap and collect appear to allow protected methods:

class Array
  protected
  def foo
    first
  end
end

a = [[1], [2], [3]]
p a.tap(&:foo)
p a.collect(&:foo)

So it appears to that :foo.to_proc should be handled the same as lambda{|t| t.foo} in regards to protected methods, unless we want to break backwards compatibility.

Here's a fix for the private method case:

diff --git a/vm_insnhelper.c b/vm_insnhelper.c
index 09fcd2f729..33daada3cf 100644
--- a/vm_insnhelper.c
+++ b/vm_insnhelper.c
@@ -3201,7 +3201,7 @@ ci_missing_reason(const struct rb_callinfo *ci)

 static VALUE
 vm_call_symbol(rb_execution_context_t *ec, rb_control_frame_t *reg_cfp,
-               struct rb_calling_info *calling, const struct rb_callinfo *ci, VALUE symbol)
+               struct rb_calling_info *calling, const struct rb_callinfo *ci, VALUE symbol, int flags)
 {
     ASSUME(calling->argc >= 0);
     /* Also assumes CALLER_SETUP_ARG is already done. */
@@ -3211,9 +3211,7 @@ vm_call_symbol(rb_execution_context_t *ec, rb_control_frame_t *reg_cfp,
     VALUE recv = calling->recv;
     VALUE klass = CLASS_OF(recv);
     ID mid = rb_check_id(&symbol);
-    int flags = VM_CALL_FCALL |
-                VM_CALL_OPT_SEND |
-                (calling->kw_splat ? VM_CALL_KW_SPLAT : 0);
+    flags |= VM_CALL_OPT_SEND | (calling->kw_splat ? VM_CALL_KW_SPLAT : 0);

     if (UNLIKELY(! mid)) {
         mid = idMethodMissing;
@@ -3300,7 +3298,7 @@ vm_call_opt_send(rb_execution_context_t *ec, rb_control_frame_t *reg_cfp, struct
        calling->argc -= 1;
        DEC_SP(1);

-        return vm_call_symbol(ec, reg_cfp, calling, calling->ci, sym);
+        return vm_call_symbol(ec, reg_cfp, calling, calling->ci, sym, VM_CALL_FCALL);
     }
 }

@@ -4104,7 +4102,7 @@ vm_invoke_symbol_block(rb_execution_context_t *ec, rb_control_frame_t *reg_cfp,
         VALUE symbol = VM_BH_TO_SYMBOL(block_handler);
         CALLER_SETUP_ARG(reg_cfp, calling, ci);
         calling->recv = TOPN(--calling->argc);
-        return vm_call_symbol(ec, reg_cfp, calling, ci, symbol);
+        return vm_call_symbol(ec, reg_cfp, calling, ci, symbol, 0);
     }
 }

Updated by Eregon (Benoit Daloze) 8 months ago

I think Symbol#to_proc should behave like public_send, i.e., only allows calling public methods.
That makes sense given we don't really have a call-site where we can assess the caller self easily, especially if converted to something like lambda{|t| t.foo} (i.e., once, in one place, on the Symbol#to_proc call, not on every Proc#call of that Proc).
This is what TruffleRuby currently implements and it seems to work well and match user expectations (we've only seen one case in Rails which called a protected method, and that code has been replaced already).

So for the protected example:

class Array
  protected def foo
    first
  end
end
a = [[1], [2], [3]]
p a.tap(&:foo) # IMHO should be the same as a.tap { |a2| a2.foo } (which is NoMethodError)
p a.collect(&:foo) # IMHO should be the same as a.collect { |o| o.foo } (which is NoMethodError)

Semantically, the self of the Proc from Symbol#to_proc, must be either the self at the time Symbol#to_proc is called, i.e., the main object here, or a dummy value like nil or maybe the Symbol itself (to signify the Proc does not actually capture the outer scope).
But the :foo.to_proc.binding.receiver shouldn't dynamically change when calling that Proc with Proc#call (CRuby raises on :foo.to_proc.binding but let's think conceptually behind that implementation detail).
The receiver should be fixed for Symbol#to_proc procs, just like for every other Proc instance.

So I think the best and most sane (i.e., natural and does not depend on a specific implementation strategy for Symbol#to_proc) semantics would be to only allow calling public methods with Symbol#to_proc.
If that's deemed too incompatible on CRuby let's at least fix the issue that Symbol#to_proc can call private methods sometimes.

Updated by Eregon (Benoit Daloze) 8 months ago

Also worth noting that for performance purposes, Symbol#to_proc should be able to be cached globally, i.e., cache the Proc in the/per Symbol instance.
So then the self inside that Proc should be a sentinel value (nil or the Symbol itself).

Updated by jeremyevans0 (Jeremy Evans) 8 months ago

I agree with the @Eregon (Benoit Daloze) 's analysis, and submitted a pull request that makes Symbol#to_proc return a proc that will only call public methods: https://github.com/ruby/ruby/pull/6018 . I'm not happy with the implementation as it copies and modifies the code for vm_call_method, but I don't know a better way to fix it.

This does cause a failure in rubyspec (which the pull request fixes), but I think that's due to the spec implementation (def at top-level defines private methods), and not by design. I'm guessing this will cause backwards compatibility issues, though hopefully they will be small, as found by TruffleRuby. I don't think it would be a good idea to backport the fix.

Updated by matz (Yukihiro Matsumoto) 8 months ago

In general, a.b(&:c) should behave exactly the same as a.b{_1.c}. But visibility check for protected methods may be too difficult for to_proc. So rejecting protected methods altogether like private methods is acceptable.

Matz.

Actions #6

Updated by jeremyevans (Jeremy Evans) 6 months ago

  • Status changed from Open to Closed

Applied in changeset git|bfa6a8ddc84fffe0aef5a0f91b417167e124dbbf.


Only allow procs created by Symbol#to_proc to call public methods

Fixes [Bug #18826]

Co-authored-by: Nobuyoshi Nakada

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0