Project

General

Profile

Actions

Bug #18801

closed

Dead YARV instructions produced when `branchif` is used

Added by wildmaples (Maple Ong) about 1 month ago. Updated 17 days ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:108669]

Description

Description

It seems there are unused YARV instructions produced when the snippet contains a branchif instruction.

In the following example, the instructions numbers 0002 to 0004 won't ever be executed:

irb(main):003:0> puts RubyVM::InstructionSequence.compile("while 2+3; puts 'hi'; end").disasm

== disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(1,25)> (catch: FALSE)
0000 jump                                   12                        (   1)[Li]
0002 putnil
0003 pop
0004 jump                                   12
0006 putself
0007 putstring                              "hi"
0009 opt_send_without_block                 <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>
0011 pop
0012 putobject                              2
0014 putobject                              3
0016 opt_plus                               <calldata!mid:+, argc:1, ARGS_SIMPLE>[CcCr]
0018 branchif                               6
0020 putnil
0021 leave

Similarly in this example, 0006-0008 won't be executed.

irb(main):003:0> puts RubyVM::InstructionSequence.compile("x = 9; while x; puts 'hi'; end").disasm
== disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(1,30)> (catch: FALSE)
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] x@0
0000 putobject                              9                         (   1)[Li]
0002 setlocal_WC_0                          x@0
0004 jump                                   16
0006 putnil
0007 pop
0008 jump                                   16
0010 putself
0011 putstring                              "hi"
0013 opt_send_without_block                 <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>
0015 pop
0016 getlocal_WC_0                          x@0
0018 branchif                               10
0020 putnil
0021 leave

Initially we thought those instructions (i.e. putnil, pop, jump) were used when the return value of the while-loop is needed.

irb(main):012:0> puts RubyVM::InstructionSequence.compile("x = while foo; 9; end").disasm
== disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(1,21)> (catch: FALSE)
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] x@0
0000 jump                                   4                         (   1)[Li]
0002 putnil
0003 pop
0004 putself
0005 opt_send_without_block                 <calldata!mid:foo, argc:0, FCALL|VCALL|ARGS_SIMPLE>
0007 branchif                               4
0009 putnil
0010 dup
0011 setlocal_WC_0                          x@0
0013 leave

But it seems like some dead instructions (0002, 0003) in the example above still remains.

Are those instructions meant to be used for something else or is it a "bug" that it sticks around?
Perhaps it can be optimized away?

Updated by eightbitraptor (Matthew Valentine-House) about 1 month ago

wildmaples (Maple Ong) wrote:

It seems there are unused YARV instructions produced when the snippet contains a branchif instruction.

I'm not sure this is branchif related. It looks like while true; end generates the same dead instructions

❯ ruby --dump=insns -e 'while true; end'
== disasm: #<ISeq:<main>@-e:1 (1,0)-(1,15)> (catch: FALSE)
0000 jump                                   4                         (   1)[Li]
0002 putnil
0003 pop
0004 jump                                   4
0006 putnil
0007 leave

Updated by eightbitraptor (Matthew Valentine-House) about 1 month ago

I investigated these instructions and whether or not they were dead code and
could be removed.

From the outside it looks like there are two things to note:

  • they immediately follow an unconditional jump and so should theoretically
    never be reached, and
  • when executed together they are effectively a no-op (push nil onto the stack,
    and then remove it)

However, this isn't the whole picture. The summary is that they currently are
necessary for the correct operation of the next keyword and cannot be removed.

A jump target (next_catch_label) is inserted in between the putnil and the
pop. I did not investigate thorougly enough to fully understand how next
works and what the significance of the putnil is, however all of my naive
attempts to remove these instructions caused either memory errors, or test
failures when working with loops.

It's possible to view the instructions generated, with the labels still in place
using RubyVM::InstructionSequence directly:

❯ ruby -e "RubyVM::InstructionSequence.compile('(while rand < 0.9; class C; next 2; end; end)').to_a.last.each { |e| puts e.inspect }"
1
:RUBY_EVENT_LINE
[:jump, :label_14]
[:putnil]
:label_3
[:pop]
[:jump, :label_14]

The instructions in question are generated by the compile_loop function in
compile.c, which is executed when a NODE_UNTIL or NODE_WHILE tree node is
compiled.

    ADD_LABEL(ret, adjust_label);
    ADD_INSN(ret, line_node, putnil);
    ADD_LABEL(ret, next_catch_label);
    ADD_INSN(ret, line_node, pop);
    ADD_INSNL(ret, line_node, jump, next_label);

If we comment out the ADD_INSN lines for putnil and pop, leaving in the
jump targets next_catch_label and next_label, then one of the flow control
tests in bootstraptest/test_flow.rb will fail. That test looks like this:

assert_equal %q{4}, %q{
  def m a, b
    a + b
  end
  m(1,
    (i=0; while i<2
       i+=1
       class C
         next 2
       end
     end; 3)
    )
}

Without these instructions this code outputs 5 instead of 4 causing an
assertion failure.

In addition, the following will cause a Segmentation Fault with miniruby, when
those instructions are not present.

./miniruby -e "while rand < 0.9; class C; next 2; end; end"

Interestingly, removing the next keyword from our test code snippet results in
different bytecode being generated by the compile_loop function: The
next_catch_label only appears when the next keyword is used, despite the
code that inserts it not being obviously conditional

With next:

❯ ruby -e "RubyVM::InstructionSequence.compile('(while rand < 0.9; class C; next 2; end; end)').to_a.last.each { |e| puts e.inspect }"
1
:RUBY_EVENT_LINE
[:jump, :label_14]
[:putnil]
:label_3
[:pop]
[:jump, :label_14]

Without next:

❯ ruby -e "RubyVM::InstructionSequence.compile('(while rand < 0.9; class C; end; end)').to_a.last.each { |e| puts e.inspect }"
1
:RUBY_EVENT_LINE
[:jump, :label_14]
[:putnil]
[:pop]
[:jump, :label_14]

The questions I have are:

  • How does that test output 5?
  • Even given the presence of the next_catch_label, the putnil instruction looks
    like it can never be reached. So why does removing it result in a Stack
    Underflow?
  • How does the conditional insertion of the next_catch_label work?
  • What is the significance of adjust_label (inserted before the putnil), and
    why is it never visible in the instruction sequence?

Updated by eightbitraptor (Matthew Valentine-House) about 1 month ago

My questions aside. I'm not sure this is a bug, and could probably be closed?

Updated by dylants (Dylan Thacker-Smith) about 1 month ago

Both of those instructions serve a purpose, but they could also be optimized away.

next_catch_label is referenced from the catch table. That catch table may be optimized away, which is why you are seeing the label being omitted from the disassembly when next is removed. Similarly, the pop that follows that label could also be optimized away when the catch table is optimized away, otherwise, it is needed to pop off the value added by the throw instruction for a non-local next usage (e.g. that test was using it within a class definition block rather than locally within the method).

The putnil instruction that precedes next_catch_label seems to only be used to adjust the stack depth for that label, to account for the value thrown by next. The instruction isn't needed for execution, so it could be replaced by an iseq element that adds an element to the stack without being compiled to an instruction.

Note that adjust_label is an example of how the stack depth at a label gets used, where it gets used by the the ADD_ADJUST_RESTORE macro to create an ISEQ_ELEMENT_ADJUST. The ISEQ_ELEMENT_ADJUST also doesn't end up generating any instructions, since the compiler calculates the necessary adjustment to be 0. I don't think the label would ever end up in the disassembly, since it doesn't look like it is used as a jump target.

Updated by mame (Yusuke Endoh) 17 days ago

  • Status changed from Open to Rejected

I believe this is not a bug. The current YARV compiler is a product of trial and error, and we know that sometimes it emits dead code (mainly due to peephole optimization). It may be room for improvement, but not a bug.

Actions

Also available in: Atom PDF