Feature #13434


better method definition in C API

Added by normalperson (Eric Wong) over 4 years ago. Updated over 4 years ago.



Current ways to define and parse arguments in the Ruby C API are clumsy,
slow, and impede potential optimizations.

The current C API for defining (rb_define_{singleton_,}method)
and parsing (rb_scan_args, rb_get_kwargs) is orthogonal but inefficient.

rb_get_kwargs creates garbage which pure Ruby kwarg methods do not.
[Feature #11339] was an ugly workaround to use Ruby wrapper methods
for IO#*nonblock methods to avoid garbage from rb_get_kwargs.

Furthermore, it should be possible to annotate args for C functions as
"read-only, use-once" or similar. In other words, it should be possible to
implement my idea from [ruby-core:80626] where method lookup can be done
out-of-order in some cases, and allow optimizations such as replacing
"putstring" insns with garbage-free "putobject" insns for constant strings
without introducing backwards incompatibility for Rubyists.

We can also get rid of the limited basic op redefinition checks and
implement more generic versions of opt_aref_with / opt_aset_with
for more functions that can take frozen string args.

The "read-only, use-once" annotation can even make it safe for
dynamic strings to be immediately recycled to reduce garbage.

So we could annotate "puts" and IO#write in a way that causes the VM to
immediately recycle its argument if it's a dynamically-generated string:

puts "#{dynamic} #{string(:here)}"
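
To make the recycling idea concrete, here is a toy model in plain C. It is only a sketch and not VM code: toy_str, toy_str_new, toy_consume_once and the one-slot free list are all hypothetical names invented for illustration. It assumes the callee is annotated "read-only, use-once", so a dynamically built argument can be handed back for reuse as soon as the call finishes.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TOY_CAP 64

/* Hypothetical stand-in for a Ruby String object. */
struct toy_str {
    char buf[TOY_CAP];
    bool dynamic;   /* built at run time (e.g. by interpolation), not a literal */
};

/* One-slot free list standing in for the VM's object recycling. */
static struct toy_str *toy_free_slot;

/* Allocate a string, reusing a recycled object when one is available. */
static struct toy_str *
toy_str_new(const char *src, bool dynamic)
{
    struct toy_str *s = toy_free_slot;

    if (s)
        toy_free_slot = NULL;
    else
        s = malloc(sizeof(*s));
    if (!s)
        return NULL;
    strncpy(s->buf, src, TOY_CAP - 1);
    s->buf[TOY_CAP - 1] = '\0';
    s->dynamic = dynamic;
    return s;
}

/*
 * Callee whose argument is assumed to be annotated "read-only, use-once".
 * Returns the number of bytes it consumed.  A dynamic argument is put on
 * the free list right away; the caller must not touch it afterwards.
 * Non-dynamic (literal-like) arguments are left alone.
 */
static size_t
toy_consume_once(struct toy_str *s)
{
    size_t len = strlen(s->buf);   /* stand-in for writing the bytes out */

    if (s->dynamic && !toy_free_slot)
        toy_free_slot = s;
    return len;
}
```

In this model, a second toy_str_new call right after toy_consume_once on a dynamic string gets the same object back, so no new allocation happens. A real implementation would also have to prove that no other reference to the string exists.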

I am not good at API design, so I'm not sure what it should look like.

Perhaps sendmsg_nonblock may be implemented like:

struct rb_method_info {
    /* to be filled in by rb_def_method ... */
};

static VALUE
sendmsg_nonblock(struct rb_method_info *info, int argc, VALUE *argv, VALUE self)
{
    VALUE mesg, flags, dest_sockaddr, control, exception;

    rb_get_args(info, argc, argv,
        &mesg, &flags, &dest_sockaddr, &control, &exception);
    /* ... */
}

/*
 * ALLCAPS variable names mean read-only (like "constants" in Ruby)
 * "1" prefix means use only once, eligible for immediate recycling
 * if dynamic string
 */
rb_def_method(rb_cBasicSocket, sendmsg_nonblock,
              "sendmsg_nonblock(1MESG, "
                "1FLAGS = 0, "
                "1DEST_SOCKADDR = nil, "
                "*1CONTROL, exception: true)", -1);

/*
 * rb_hash_aset can be done as below,
 * where 0KEY ("0" prefix, not "1") means it is constant and persistent,
 * and "val" (all lower case, no prefix) means it is a normal
 * variable which can persist after the function returns
 */
rb_def_method(rb_cHash, rb_hash_aset, "[0KEY]=val", 2);
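
As a rough illustration of how the annotations in such a signature string could be decoded, here is a standalone C sketch. Nothing in it exists in Ruby: struct param_info and classify_param are hypothetical, and the rules ("*" for splat, "1" for use-once, "0" for constant and persistent, ALLCAPS for read-only) are only one reading of the examples above.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-parameter annotations decoded from one signature token. */
struct param_info {
    char name[32];
    bool splat;      /* leading '*' */
    bool use_once;   /* "1" prefix: may be recycled right after the call */
    bool persistent; /* "0" prefix: constant and persistent */
    bool read_only;  /* ALLCAPS name: callee never modifies it */
};

/*
 * Decode a single token such as "1MESG", "*1CONTROL", "0KEY" or "val".
 * Anything after the name (default values, "=", ":") is ignored here.
 */
static void
classify_param(const char *tok, struct param_info *out)
{
    size_t n = 0;
    bool has_upper = false, has_lower = false;

    memset(out, 0, sizeof(*out));
    while (*tok == ' ')
        tok++;
    if (*tok == '*') {
        out->splat = true;
        tok++;
    }
    if (*tok == '1') {
        out->use_once = true;
        tok++;
    }
    else if (*tok == '0') {
        out->persistent = true;
        tok++;
    }

    /* Copy the identifier and track its letter case. */
    while ((isalnum((unsigned char)*tok) || *tok == '_') &&
           n < sizeof(out->name) - 1) {
        if (isupper((unsigned char)*tok)) has_upper = true;
        if (islower((unsigned char)*tok)) has_lower = true;
        out->name[n++] = *tok++;
    }
    out->name[n] = '\0';

    /* Read-only only if the name has letters and none are lower case. */
    out->read_only = has_upper && !has_lower;
}
```

With these rules, "1MESG" decodes as a read-only, use-once parameter named MESG, "0KEY" as read-only and persistent, and "val" as an ordinary parameter with no annotations. Doing this once at definition time (inside something like rb_def_method) would let the VM consult the flags on every call without reparsing the string.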


The existing C API must continue to work, so 3rd-party extensions can
migrate to the new API slowly.

