Feature #18943
New constant caching instruction: opt_getconstant_path
Status: closed
Description
I'd like to propose a change to the bytecode used for constant caching.
I've submitted this improvement via pull request at https://github.com/ruby/ruby/pull/6187 and also attached a patch to this issue.
Previously, YARV bytecode implemented constant caching with a pair of instructions, opt_getinlinecache and opt_setinlinecache, wrapping a series of getconstant calls (with putobject providing supporting arguments).
# old
$ ruby --dump=insns -e 'Foo::Bar::Baz'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE)
0000 opt_getinlinecache 17, <is:0> ( 1)[Li]
0003 putobject true
0005 getconstant :Foo
0007 putobject false
0009 getconstant :Bar
0011 putobject false
0013 getconstant :Baz
0015 opt_setinlinecache <is:0>
0017 leave
This commit replaces that pattern with a new instruction, opt_getconstant_path, handling both getting/setting the inline cache and fetching the constant on a cache miss.
This is implemented by storing the full constant path as a null-terminated array of IDs inside of the IC structure. idNULL is used to signal an absolute constant reference.
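As a rough illustration of the idea (a Ruby-level model, not the C code from the PR), the path can be pictured as a list of segment names with a sentinel in front for absolute references. The names ABSOLUTE and resolve_path are hypothetical, and the lookup is simplified: a real relative reference resolves its first segment lexically.

```ruby
# Hypothetical model of a constant path held in an inline cache.
# ABSOLUTE stands in for idNULL; a Ruby Array needs no terminator.
ABSOLUTE = :"(absolute)"

# Walk the segments on a cache miss, starting from `scope`
# (or from Object when the path begins with ABSOLUTE).
# Simplification: real relative lookups start from the lexical scope.
def resolve_path(segments, scope = Object)
  names = segments.dup
  if names.first == ABSOLUTE
    names.shift
    scope = Object
  end
  names.reduce(scope) { |mod, name| mod.const_get(name) }
end

module Foo
  module Bar
    Baz = 42
  end
end

relative = [:Foo, :Bar, :Baz]            # Foo::Bar::Baz
absolute = [ABSOLUTE, :Foo, :Bar, :Baz]  # ::Foo::Bar::Baz

puts resolve_path(relative)  # prints 42
puts resolve_path(absolute)  # prints 42
```

Holding the whole path in one place is what lets a single instruction both consult the cache and perform the full lookup on a miss.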
# new
$ ./miniruby --dump=insns -e '::Foo::Bar::Baz'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,13)> (catch: FALSE)
0000 opt_getconstant_path <ic:0 ::Foo::Bar::Baz> ( 1)[Li]
0002 leave
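The same comparison can be made from inside Ruby instead of the command line. The sketch below uses CRuby's RubyVM::InstructionSequence API; the helper name instruction_names is made up for this example, and which branch prints depends on the Ruby build running it.

```ruby
# List the instruction names CRuby emits for a snippet of source.
# RubyVM::InstructionSequence is CRuby-specific.
def instruction_names(source)
  body = RubyVM::InstructionSequence.compile(source).to_a.last
  # The body mixes line numbers (Integer), labels/events (Symbol),
  # and instructions (Array whose first element is the opcode name).
  body.grep(Array).map(&:first)
end

names = instruction_names("Foo::Bar::Baz")
if names.include?(:opt_getconstant_path)
  puts "single-instruction constant path"
elsif names.include?(:opt_getinlinecache)
  puts "opt_getinlinecache/opt_setinlinecache pair"
else
  puts "other: #{names.inspect}"
end
```

The source is only compiled, never evaluated, so Foo does not need to be defined.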
The motivation for this is that we had increasingly found the need to disassemble the instructions between the opt_getinlinecache and opt_setinlinecache in order to determine the constant we are fetching, or otherwise store metadata.
This disassembly was previously done:
- In opt_setinlinecache, to register the IC against the constant names it is using for granular invalidation.
- In rb_iseq_free, to unregister the IC from the invalidation table.
- In YJIT, to find the position of an opt_getinlinecache instruction to invalidate it when the cache is populated.
- In YJIT, to register the constant names being used for invalidation.
With this change we no longer need disassembly for these (in fact rb_iseq_each is now unused and is removed in the PR), as the list of constant names being referenced is held in the IC. This should also make it possible to make more optimizations in the future.
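To make the registration step concrete, here is a small Ruby model of a name-keyed invalidation table. It is only a sketch: InlineCache and InvalidationTable are hypothetical stand-ins, not the structures used in the VM or in YJIT.

```ruby
require "set"

# Hypothetical stand-in for an IC: the path it caches plus the cached value.
class InlineCache
  attr_reader :segments
  attr_accessor :value

  def initialize(segments, value)
    @segments = segments
    @value = value
  end
end

# Hypothetical invalidation table: constant name => caches that mention it.
class InvalidationTable
  def initialize
    @by_name = Hash.new { |hash, name| hash[name] = Set.new }
  end

  # No bytecode walk needed: the names come straight from the cache.
  def register(cache)
    cache.segments.each { |name| @by_name[name] << cache }
  end

  def unregister(cache)
    cache.segments.each { |name| @by_name[name].delete(cache) }
  end

  # Clear only the caches whose path mentions `name`.
  def invalidate(name)
    @by_name[name].each { |cache| cache.value = nil }
  end
end

table = InvalidationTable.new
a = InlineCache.new([:Foo, :Bar, :Baz], 42)
b = InlineCache.new([:Qux], 7)
table.register(a)
table.register(b)

table.invalidate(:Bar)
p a.value  # prints nil
p b.value  # prints 7
```

Both register and unregister read the names from the cache itself, which mirrors the role the segment list plays in the list above.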
This may also reduce the size of iseqs: previously each constant segment required 32 bytes of bytecode (assuming a 64-bit platform), whereas this implementation stores only one 8-byte ID per segment.
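The per-segment figures can be checked with some quick arithmetic. In the old listing each segment is a putobject plus a getconstant, and the offsets (0003, 0005, 0007) show two slots each. The sketch below assumes 8-byte slots and counts one extra array entry for the terminator; it ignores the wrapper instructions and the IC structure itself.

```ruby
WORD = 8  # bytes per instruction slot / ID, assuming a 64-bit platform

# Old encoding: each segment is putobject + getconstant, two slots each.
def old_segment_bytes(segments)
  segments * 4 * WORD
end

# New encoding: one ID per segment in the IC's path array.
# The extra terminator entry is an assumption based on "null-terminated".
def new_segment_bytes(segments, terminator: true)
  (segments + (terminator ? 1 : 0)) * WORD
end

[1, 3, 5].each do |n|
  puts "#{n} segments: old #{old_segment_bytes(n)} B, new #{new_segment_bytes(n)} B"
end
```

For a three-segment path such as Foo::Bar::Baz this gives 96 bytes under the old encoding and 32 bytes (including the terminator) under the new one.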
There should be no significant performance difference between this and the previous implementation. Previously opt_getinlinecache was a "leaf" instruction, but it included a jump (almost always to a separate cache line). Now opt_getconstant_path is a non-leaf instruction (it may raise/autoload/call const_missing) but it does not jump. These seem to even out. This also removes a field from the IC structure that was needed by YJIT, but adds the ID *segments field, so the size remains the same.