Feature #8088

Method#parameters (and friends) should provide useful information about core methods

Added by Charles Nutter about 1 year ago. Updated 7 months ago.

[ruby-core:53386]
Status:Open
Priority:Normal
Assignee:-
Category:-
Target version:-

Description

I was wiring up #parameters to work for native methods today when I realized MRI doesn't give very good information about variable-arity native methods:

ext-jruby-local ~/projects/jruby $ ruby2.0.0 -e "p ''.method(:gsub).to_proc.parameters"
[[:rest]]

ext-jruby-local ~/projects/jruby $ jruby -e "p ''.method(:gsub).to_proc.parameters"
[[:req], [:opt]]

I think MRI should present the same as JRuby here; gsub is obviously not a rest-arg method and you can't call it with less than 1 or more than 2 arguments. JRuby's presenting the right output here.

I'm probably going to have to change JRuby to do the less-helpful version so we're compliant and tests pass, but I think the specification of #parameters should be that it presents the JRuby version above rather than the MRI version.
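
To make the difference concrete, here is a minimal Ruby sketch of how a caller might derive the accepted argument-count range from a #parameters list. The helper name arity_range is hypothetical (it is not part of any implementation), and the two arrays are just the MRI-style and JRuby-style results discussed in this issue.

```ruby
# Hypothetical helper: derive the accepted positional-argument range
# from a Method#parameters-style list.
def arity_range(params)
  required = params.count { |type, _| type == :req }
  optional = params.count { |type, _| type == :opt }
  has_rest = params.any? { |type, _| type == :rest }
  max = has_rest ? Float::INFINITY : required + optional
  required..max
end

mri_style   = [[:rest]]         # MRI-style result for String#gsub
jruby_style = [[:req], [:opt]]  # JRuby-style result for String#gsub

p arity_range(mri_style)    # 0..Infinity
p arity_range(jruby_style)  # 1..2
```

With the rest-only form a tool cannot tell that gsub rejects zero or three arguments; with the richer form it can.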

History

#1 Updated by Marc-Andre Lafortune about 1 year ago

+1.

I plan on proposing a new C API for registering methods, to allow for complete information, including keyword parameters.

#2 Updated by Yorick Peterse about 1 year ago

I would love to see this change as well. Quite a few times already I've
had to retrieve the arguments of a method (and the names in particular)
but with the output being not very trustworthy this has always been a
hit-and-miss.

While we're at it, it would also be nice to include the argument names
if that's possible, though that may be something for a separate feature
request.

Yorick

#3 Updated by Charles Nutter about 1 year ago

marcandre (Marc-Andre Lafortune) wrote:

+1.

I plan on proposing a new C API for registering methods, to allow for complete information, including keyword parameters.

I was contemplating hacking something together myself, in fact. Would like to see what you're proposing...can you add an issue to CommonRuby?

For some background, JRuby's method-registering mechanism is based on Java annotations. For example, String#gsub:

@JRubyMethod(name = "gsub", reads = BACKREF, writes = BACKREF, compat = RUBY1_9)
public IRubyObject gsub19(ThreadContext context, IRubyObject arg0, Block block) {
    return block.isGiven() ? gsubCommon19(context, block, null, null, arg0, false, 0) : enumeratorize(context.runtime, this, "gsub", arg0);
}

@JRubyMethod(name = "gsub", reads = BACKREF, writes = BACKREF, compat = RUBY1_9)
public IRubyObject gsub19(ThreadContext context, IRubyObject arg0, IRubyObject arg1, Block block) {
    return gsub19(context, arg0, arg1, block, false);
}

We register two separate endpoints for the two supported arities. When these methods are bound into String, we have information on required arguments, optional arguments, and whether there's a rest arg. We don't currently include the actual Java local variables in #parameters output, but we easily could.

So I suppose you're thinking about something more like this for MRI?

rb_define_method_x(rb_cString, "gsub", rb_str_gsub, 1 /*req*/, 1 /*opt*/, 0 /*rest*/)

It would be super nice if it's possible to get actual C parameters, but I don't think you can do that programmatically (i.e. they'd have to be passed in).
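
As a rough illustration of what such a registration call would have to record, here is a Ruby sketch that turns the three counts into a #parameters-style list. The function name parameters_from_counts is made up for this example, and the rest flag is an integer only to mirror the C-style call above.

```ruby
# Hypothetical: build a name-less #parameters-style list from the
# required count, optional count, and rest flag (0 or 1) that a
# registration call like the one sketched above might pass in.
def parameters_from_counts(req, opt, rest)
  params = Array.new(req) { [:req] } + Array.new(opt) { [:opt] }
  params << [:rest] unless rest.zero?
  params
end

p parameters_from_counts(1, 1, 0)  # [[:req], [:opt]]
p parameters_from_counts(1, 0, 1)  # [[:req], [:rest]]
```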

#4 Updated by Marc-Andre Lafortune about 1 year ago

headius (Charles Nutter) wrote:

I was contemplating hacking something together myself, in fact. Would like to see what you're proposing...can you add an issue to CommonRuby?

I could, or maybe you and I can work on something together before submitting it?

So I suppose you're thinking about something more like this for MRI?

rb_define_method_x(rb_cString, "gsub", rb_str_gsub, 1 /*req*/, 1 /*opt*/, 0 /*rest*/)

Kind of.

My goals would be to be able to, at least:
- know the minimum and maximum arity (where max can be unlimited). See #5747
- know list of optional and mandatory keyword arguments, presence of keyrest. See #6086
- know if a block can be passed. See #7299

It should also be possible to name the arguments.

This would allow any method(:built_in).parameters to return just about anything we want, like for a Ruby method.
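
For reference, this is the level of detail #parameters already gives for a method written in Ruby. The class and method names below are placeholders, and the required keyword argument (d:) assumes Ruby 2.1 or later.

```ruby
# A Ruby-defined method that uses every parameter kind.
class Demo
  def sample(a, b = 1, *c, d:, e: 2, **f, &g); end
end

p Demo.instance_method(:sample).parameters
# [[:req, :a], [:opt, :b], [:rest, :c], [:keyreq, :d], [:key, :e], [:keyrest, :f], [:block, :g]]
```

The goals listed above amount to making built-in methods report the same kinds of entries.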

I'm thinking of having something like:

rb_define_method_x(rb_cString, "gsub", rb_str_gsub, "pattern, [replacement], [&]");

Although a string means some form of parsing, it also makes the API extensible as well as expressive. In any case, a string is required for named parameters.

Actually, we could even reuse rb_define_method, e.g.:

rb_define_method_x(rb_cString, "gsub(pattern, [replacement], [&])", rb_str_gsub, 0);
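
To give a feel for how little parsing the string form needs, here is a Ruby sketch of a parser for it. The grammar is an assumption made for this example, not something specified in this thread: comma-separated names, [name] for optional, *name for rest, and & or [&] for a block. The name parse_signature is likewise hypothetical.

```ruby
# Hypothetical parser for a signature string such as
# "pattern, [replacement], [&]". Returns a #parameters-style list.
def parse_signature(spec)
  spec.split(",").map(&:strip).map do |token|
    optional = token.start_with?("[") && token.end_with?("]")
    token = token[1..-2] if optional       # strip the surrounding brackets
    if token.start_with?("&")
      name = token[1..-1]
      name.empty? ? [:block] : [:block, name.to_sym]
    elsif token.start_with?("*")
      name = token[1..-1]
      name.empty? ? [:rest] : [:rest, name.to_sym]
    else
      [optional ? :opt : :req, token.to_sym]
    end
  end
end

p parse_signature("pattern, [replacement], [&]")
# [[:req, :pattern], [:opt, :replacement], [:block]]
```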

It would be super nice if it's possible to get actual C parameters, but I don't think you can do that programmatically (i.e. they'd have to be passed in).

Sorry, not sure what you mean.

#5 Updated by Charles Nutter about 1 year ago

marcandre (Marc-Andre Lafortune) wrote:

headius (Charles Nutter) wrote:

I was contemplating hacking something together myself, in fact. Would like to see what you're proposing...can you add an issue to CommonRuby?

I could, or maybe you and I can work on something together before submitting it?

I'm game to toss some ideas around. I don't know MRI internals well enough to implement much of it myself :-)

My goals would be to be able to, at least:
- know the minimum and maximum arity (where max can be unlimited). See #5747
- know list of optional and mandatory keyword arguments, presence of keyrest. See #6086
- know if a block can be passed. See #7299

Yup, good. In JRuby blocks can always be passed, but it would be nice to know if a block is required. Method#parameters will probably need some enhancement for that.

I'm thinking of having something like:

rb_define_method_x(rb_cString, "gsub", rb_str_gsub, "pattern, [replacement], [&]");

Although a string means some form of parsing, it also makes the API extensible as well as expressive. In any case, a string is required for named parameters.

Actually, we could even reuse rb_define_method, e.g.:

rb_define_method_x(rb_cString, "gsub(pattern, [replacement], [&])", rb_str_gsub, 0);

Parsing is not unusual in MRI method logic anyway. When you have optional or keyword args, you have to pass in a formatted string that basically has this same info without names. So I don't think adding a format for specifying the argument list is unreasonable.
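
The formatted string mentioned here is presumably the one passed to rb_scan_args. As a simplified sketch (an assumption-laden subset, not MRI's actual parser), such a format could be decoded into a name-less #parameters-style list as follows. Mapping ':' to [:keyrest] is a guess at a reasonable representation, and the function name is hypothetical.

```ruby
# Simplified, hypothetical decoder for an rb_scan_args-style format:
#   leading digit  = required count, second digit = optional count,
#   '*' = rest, trailing digit = required-after-rest count,
#   ':' = keyword hash, '&' = block.
def parameters_from_scan_format(fmt)
  m = /\A(\d)?(\d)?(\*)?(\d)?(:)?(&)?\z/.match(fmt)
  raise ArgumentError, "unsupported format: #{fmt.inspect}" unless m
  lead, opt, rest, post, keys, block = m.captures
  params = []
  lead.to_i.times { params << [:req] }   # nil.to_i is 0, so absent parts add nothing
  opt.to_i.times  { params << [:opt] }
  params << [:rest] if rest
  post.to_i.times { params << [:req] }
  params << [:keyrest] if keys
  params << [:block] if block
  params
end

p parameters_from_scan_format("11")   # [[:req], [:opt]]
p parameters_from_scan_format("1*&")  # [[:req], [:rest], [:block]]
```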

It would be super nice if it's possible to get actual C parameters, but I don't think you can do that programmatically (i.e. they'd have to be passed in).

Sorry, not sure what you mean.

In JRuby, because we're just marking up Java source for native methods, we can see (in addition to required count, optional count, rest arg present) the names of the arguments, whether the method will do anything with a block (but not whether it's required), and so on...all via Java's reflective capabilities. I don't think anything equivalent exists either at a C macro level or programmatically (e.g., I know there's nothing for inspecting a function pointer and getting argument information), so any information we want to present will have to be provided manually.

I have also considered expanding our annotations to mark up specific parameters as coerced (generating coercion code or type errors automatically) and to include richer information about variable arity call paths, keyword args, and so on. Potentially, we'd be able to mark up a normal Java method sorta like this:

public IRubyObject some_impl(IRubyObject @required arg0, IRubyObject @keyword foo, IRubyObject @keyword bar) ...

...and automatically pass keyword args into the "foo" and "bar" variables with no intermediate Hash. Lots of potential here.

#6 Updated by Charles Nutter about 1 year ago

marcandre: Have you had a chance to prototype what you were talking about? I'll be on #jruby Freenode IRC today if you want to chat about a prototype impl.

#7 Updated by Yorick Peterse 9 months ago

As a follow up, a while ago I resolved a similar issue in Rubinius. Although
Rubinius would provide correct argument types it would consider all local
variables in a method as block arguments. As a result you'd quickly end up with
methods with dozens of parameters while in reality they only had a few.

At this point Rubinius is the only implementation that I am aware of that
provides accurate results when using UnboundMethod#parameters. It would be
great for the other implementations to also properly address this issue.

#8 Updated by Charles Nutter 9 months ago

What do you mean "accurate results when using UnboundMethod#parameters"?

$ bin/jruby -e "def foo(a, b=1, *c, &d); end; p self.class.instance_method(:foo).parameters"
[[:req, :a], [:opt, :b], [:rest, :c], [:block, :d]]

$ ruby2.0.0 -e "def foo(a, b=1, *c, &d); end; p self.class.instance_method(:foo).parameters"
[[:req, :a], [:opt, :b], [:rest, :c], [:block, :d]]

#9 Updated by Yorick Peterse 9 months ago

Consider the following code:

  def example(required, optional = 10)
  end

  method(:example).parameters

On all Ruby implementations this works as expected and results in the
following:

  [[:req, :required], [:opt, :optional]]

The problem, at least with MRI, is that the moment you do something
similar with methods that are written in C all meaningful information is
lost:

  String.instance_method(:gsub).parameters # => [[:rest]]

This is false since gsub has at least 1 required argument. This happens
with a lot of methods (if not all) in MRI that are implemented in C.
JRuby is also affected by this (at least with the above example).
Rubinius is thus far the only implementation that gets this right that I
know of.

In hindsight, I probably should've made the above clear from the start.

Yorick

#10 Updated by Charles Nutter 9 months ago

On Wed, Jul 10, 2013 at 11:16 AM, Yorick Peterse
yorickpeterse@gmail.com wrote:

The problem, at least with MRI, is that the moment you do something
similar with methods that are written in C all meaningful information is
lost:

String.instance_method(:gsub).parameters # => [[:rest]]

This is false since gsub has at least 1 required argument. This happens
with a lot of methods (if not all) in MRI that are implemented in C.
JRuby is also affected by this (at least with the above example).
Rubinius is thus far the only implementation that gets this right that I
know of.

Yes, that is the reason I filed this feature. :-)

Rubinius diverges from everyone else here and presents its own
argument list for #parameters rather than presenting the same
information as MRI. I would like to see #parameters reflect meaningful
information for even native methods, but the JRuby policy is to not
unilaterally make such decisions.

We could (and at one point, did) present the same data as Rubinius,
but up to now we have chosen to match MRI.

- Charlie

#11 Updated by Charles Nutter 9 months ago

FWIW, here's output from and patch to enable "rich" parameter
information on JRuby. We do not provide the argument names because
JRuby often implements multiple-arity methods with multiple native
code bodies (so there's potentially different argument names for each
arity). I would not expect to require parameter names for native
methods, since there will be many ways to implement native methods and
some may be incompatible with a single variable name for each
position.

The variable names are not very useful anyway...I think most people
will be interested in the parameter types.

system ~/projects/jruby $ jruby -e "p String.instance_method(:gsub).parameters"
[[:req], [:opt]]

system ~/projects/jruby $ jruby -e "p Array.instance_method(:[]=).parameters"
[[:req], [:req], [:opt]]
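
The name-less lists above can be derived from nothing more than the arities of the native bodies bound to each method name. Here is a Ruby sketch of that idea. The function name is hypothetical, and the arity lists ([1, 2] for gsub, [2, 3] for []=) are assumptions inferred from the outputs shown, not taken from the JRuby source.

```ruby
# Hypothetical: merge the arities of several fixed-arity native bodies
# into one name-less #parameters-style list. Everything up to the
# smallest arity is required; the remainder up to the largest is optional.
def parameters_from_arities(arities)
  min, max = arities.minmax
  Array.new(min) { [:req] } + Array.new(max - min) { [:opt] }
end

p parameters_from_arities([1, 2])  # [[:req], [:opt]]
p parameters_from_arities([2, 3])  # [[:req], [:req], [:opt]]
```

Because each arity may come from a different native body, there is no single obvious name to attach to each position, which is why names are left out.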

system ~/projects/jruby $ git diff
diff --git a/core/src/main/java/org/jruby/internal/runtime/methods/InvocationMethodFactory.java b/core/src/main/java/org/jruby/internal/runtime/methods/InvocationMethodFactory.java
index dc45ae9..014f09a 100644
--- a/core/src/main/java/org/jruby/internal/runtime/methods/InvocationMethodFactory.java
+++ b/core/src/main/java/org/jruby/internal/runtime/methods/InvocationMethodFactory.java
@@ -602,7 +602,7 @@ public class InvocationMethodFactory extends MethodFactory implements Opcodes {
         private boolean block;
         private String parameterDesc;
 
-        private static final boolean RICH_NATIVE_METHOD_PARAMETERS = false;
+        private static final boolean RICH_NATIVE_METHOD_PARAMETERS = true;
 
         public DescriptorInfo(List<JavaMethodDescriptor> descs) {
             min = Integer.MAX_VALUE;


#12 Updated by Yorick Peterse 9 months ago

Actually I personally do have a use case for the argument names being
available, but I wouldn't be too surprised if I was one of the few ones
that actually needed it. In my case I have some code that builds
definitions of Ruby methods and such, including the argument types and
names. An example of the result of this can be seen here:
http://git.io/YGex6A

Having said that, we seem to mostly agree so I'll try to not deviate
from the subject any further :)

Yorick

#13 Updated by Charles Nutter 7 months ago

Any possibility of getting this in for 2.1?
