Misc #16956
Updated by k0kubun (Takashi Kokubun) over 3 years ago
# What's this ticket?
A text explaining what attributes MJIT uses for optimizations and why they're needed.
This is written here in case people want to comment on this in a way that can notify me.
# Current state
## The per-insn attribute we currently have and MJIT uses
```
* leaf: indicates that the instruction is "leaf" i.e. it does
not introduce new stack frame on top of it.
If an instruction handles sp, that can never be a leaf.
```
## MJIT's optimizations which rely on leaf
* PC motion skip
  * If an insn is leaf, nothing observes cfp->pc while it runs: no exception is thrown, and no arbitrary method that might read lineno is called. So MJIT can skip updating cfp->pc for that insn. The code also checks catch_except_p=false to make sure the catch table of this iseq is not used.
  * https://github.com/ruby/ruby/blob/daea41c3df0d63eda553c92c0ca29eaceb6d5828/tool/ruby_vm/views/_mjit_compile_pc_and_sp.erb#L10-L13
* Deoptimization check skip
  * If an insn is leaf, it never calls an arbitrary method that could invalidate the current code (through TracePoint or GC.compact), so the deoptimization check for that insn can be skipped.
  * https://github.com/ruby/ruby/blob/daea41c3df0d63eda553c92c0ca29eaceb6d5828/tool/ruby_vm/views/_mjit_compile_insn.erb#L61-L71
* Frame push omission on method inlining
  * If an insn is leaf, it never pushes a cfp to the stack. So for an inlined method whose insns are all leaf, MJIT can skip pushing the method's own cfp (which would otherwise be the base of anything pushed on top of it). This also needs the same guarantee as "PC motion skip", because cfp->pc (and, as far as the callstack is concerned, even the existence of the frame) isn't maintained at all.
  * https://github.com/ruby/ruby/blob/daea41c3df0d63eda553c92c0ca29eaceb6d5828/mjit_compile.c#L377-L378
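As a rough illustration of how the leaf attribute gates these three optimizations, here is a minimal Ruby sketch. It is a toy model, not MJIT's code generator: `Insn`, `emitted_steps`, `frame_push_omittable?`, and the step names are made up for this example, the leaf values are assigned by hand, and applying the catch_except_p check to frame push omission is an assumption read from "the same guarantee as PC motion skip".

```ruby
# Toy model of the leaf-based decisions; every name here is hypothetical.
Insn = Struct.new(:name, :leaf)

# Steps a code generator would emit around one insn.
def emitted_steps(insn, catch_except_p:)
  steps = []
  # PC motion skip: cfp->pc is left stale only for a leaf insn in an
  # iseq whose catch table is not used.
  steps << :move_pc unless insn.leaf && !catch_except_p
  steps << :insn_body
  # Deoptimization check skip: a leaf insn cannot trigger JIT cancel-all.
  steps << :deopt_check unless insn.leaf
  steps
end

# Frame push omission: only for an inlined method made of leaf insns.
# Requiring catch_except_p == false here is an assumption of this sketch.
def frame_push_omittable?(insns, catch_except_p:)
  !catch_except_p && insns.all?(&:leaf)
end

putnil = Insn.new("putnil", true)  # leaf value assigned by hand
send_  = Insn.new("send", false)   # pushes a frame, so not leaf

p emitted_steps(putnil, catch_except_p: false)
p emitted_steps(send_, catch_except_p: false)
p frame_push_omittable?([putnil], catch_except_p: false)
p frame_push_omittable?([putnil, send_], catch_except_p: false)
```

In this model a single non-leaf insn is enough to bring back the PC update and the deoptimization check for that insn, and to rule out frame push omission for the whole method.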
## Fine-grained speculations
* (1) leaf: A cfp is never pushed to the stack
* (2) An arbitrary method is never called
  * Obviously this is guaranteed by (1).
* (3) An exception is never thrown
  * MJIT assumes this from (2). This is legitimate because raising an exception needs a method call to initialize the exception object: rb_raise calls rb_exc_new3 => rb_class_new_instance => rb_obj_call_init_kw => rb_funcallv_kw.
* (4) `mjit_call_p = false` is never set
  * a.k.a. JIT cancel-all. It's set by TracePoint and GC.compact. Therefore assuming it from (2) should be fair.
* (5) cfp->pc is never read
  * Aside from insn dispatch and the catch table, cfp->pc is only read by a C method showing lineno of a callstack or calling a C API like `rb_profile_frames`. If this assumption is true, we can assume this from (2).
### leaf vs attr inline
We introduced `Primitive.attr!` at https://github.com/ruby/ruby/pull/3244. Why isn't the attribute called `leaf`? This is because:
* For "frame push omission on method inlining", MJIT currently depends on (1), (2), (3), and (5).
* As said above, (2) and (3) can be assumed from (1). However, assuming (5) from (1) in arbitrary builtin C functions is questionable, unlike VM insns.
* Thus `Primitive.attr!` declares that the method satisfies both (1) and (5). But we've implemented verification only for (1) (see: https://github.com/ruby/ruby/pull/3244), so (5) is not verified yet.
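The reasoning above can be written down as a small Ruby sketch. This is only a model of the speculation numbering (1) to (5) from this ticket, not anything MJIT contains: `assumed_speculations`, `omission_allowed?`, and the `:vm_insn` / `:builtin_func` labels are invented for illustration.

```ruby
# Hypothetical model of the speculations (1)-(5) listed in this ticket.
FRAME_PUSH_OMISSION_NEEDS = [1, 2, 3, 5].freeze

# kind:     :vm_insn or :builtin_func (labels invented for this sketch)
# declared: speculation numbers stated directly, e.g. by an annotation
def assumed_speculations(kind, declared)
  set = declared.dup
  # (1) => (2): no frame push means no arbitrary method call.
  set |= [2] if set.include?(1)
  # (2) => (3), (4): as argued in the list of speculations.
  set |= [3, 4] if set.include?(2)
  # (2) => (5) is only trusted for VM insns; a builtin C function could
  # read cfp->pc on its own, so it has to declare (5) itself.
  set |= [5] if kind == :vm_insn && set.include?(2)
  set.sort
end

# Frame push omission is allowed only when every needed speculation holds.
def omission_allowed?(kind, declared)
  (FRAME_PUSH_OMISSION_NEEDS - assumed_speculations(kind, declared)).empty?
end

p assumed_speculations(:vm_insn, [1])
p omission_allowed?(:builtin_func, [1])
p omission_allowed?(:builtin_func, [1, 5])
```

A builtin function that only declares (1) is rejected in this model because (5) is missing, which is why the attribute has to say more than `leaf` does.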
# Discussions
## Primitive.attr!
Currently there's no easy way to know the behavior of a C method. We may want to annotate a C method to provide information like what's described above.
Apparently [Feature #16254] lists this problem ("Annotation issues") as one of its motivations. Therefore converting a C method to a builtin method and annotating that method would be the most legitimate way we can foresee.
When we think about annotating a method that has a builtin insn, there can be two ways to satisfy MJIT's immediate need:
1. Per-method attribute
2. Per-insn attribute for builtin insns
For now I'm trying to add an attribute so that an iseq with a builtin insn can be analyzed to be side-effect free. Option 1 is a direct representation of that. With option 2, if the attribute lets us assume a builtin insn is leaf=true, a method (possibly with other non-builtin insns) can be analyzed as leaf=true by combining it with the other insns' leaf attributes. When we convert a C method to a single builtin insn, both 1 and 2 work totally fine.
Since ko1 preferred 1 for simplicity, I'm thinking about having a per-method attribute annotation (`Primitive.attr!`). I'm still thinking about what attributes should be annotated, but it's going to be something required for "Frame push omission".
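To make the difference between the two options concrete, here is a minimal Ruby sketch. Everything in it is hypothetical: `Insn`, `Iseq`, `method_attrs`, and the `:inline` symbol are placeholders for whatever the annotation ends up being, and the insn lists are simplified to the insns that matter for the comparison.

```ruby
# Toy model comparing the two options; all names are hypothetical.
Insn = Struct.new(:name, :leaf)
Iseq = Struct.new(:insns, :method_attrs)

# Option 1: the whole method carries the attribute.
def leaf_by_method_attr?(iseq)
  iseq.method_attrs.include?(:inline)
end

# Option 2: the builtin insn carries leaf=true and the method is
# analyzed insn by insn, together with the other insns' leaf attributes.
def leaf_by_insn_attrs?(iseq)
  iseq.insns.all?(&:leaf)
end

builtin = Insn.new("invokebuiltin", true) # leaf declared per insn (option 2)
single  = Iseq.new([builtin], [:inline])  # a C method converted to one builtin insn
mixed   = Iseq.new([builtin, Insn.new("send", false)], [])

p leaf_by_method_attr?(single)
p leaf_by_insn_attrs?(single)
p leaf_by_insn_attrs?(mixed)
```

For the single-builtin-insn method both checks agree. The per-insn check only adds something for methods that mix a builtin insn with other insns.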