Bug #16841

Some syntax errors are thrown from compile.c

Added by ibylich (Ilya Bylich) 3 months ago. Updated about 2 months ago.

Target version:
ruby -v:
ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-darwin19]


compile.c has a few places where it raises SyntaxError. Because these errors are raised at compile time rather than at parse time, ruby -c, Ripper and RubyVM::AbstractSyntaxTree don't catch them:

> ruby -vce 'class X; break; end'
ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-darwin19]
Syntax OK
2.7.1 :001 > require 'ripper'
 => false
2.7.1 :002 > Ripper.sexp('class X; break; end')
 => [:program, [[:class, [:const_ref, [:@const, "X", [1, 6]]], nil, [:bodystmt, [[:void_stmt], [:break, []]], nil, nil, nil]]]]
2.7.1 :003 > RubyVM::AbstractSyntaxTree.parse('class X; break; end')
 => #<RubyVM::AbstractSyntaxTree::Node:SCOPE@1:0-1:19>
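For comparison, here is a minimal sketch of a check that also runs the compile phase, so errors raised from compile.c are seen. It assumes CRuby (RubyVM is implementation specific), and the helper name compiles? is made up for illustration. It only compiles the source and never evaluates it.

```ruby
# Hypothetical helper: true if the source both parses and compiles.
# RubyVM::InstructionSequence.compile runs the compiler as well as the parser,
# so a SyntaxError raised from either phase ends up in the rescue below.
def compiles?(source)
  RubyVM::InstructionSequence.compile(source)
  true
rescue SyntaxError
  false
end

puts compiles?('class X; break; end')
puts compiles?('class X; end')
```

The first call should report false and the second true, regardless of whether the running Ruby rejects the break in the parser or in the compiler.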

I locally changed assert_valid_syntax to parse with RubyVM::AbstractSyntaxTree and got ~5 failing tests (such as Invalid next/break/redo, plus one related to pattern matching).

I started playing with parse.y yesterday, but I quickly realized that rejecting such code requires some information about scopes (basically something like a stack of scopes).
That way we could reject break whenever we are not directly inside a block/lambda/loop.
But then I realized that we can't properly collect stack elements (by doing something like scopes.push(<scope name>)) for post-loops:

break while true

because the rule is

| stmt modifier_while expr_value

and adding something like { push_context(p, IN_LOOP) } in front of it causes a ton of shift/reduce conflicts (which makes a lot of sense, since the parser cannot know that a modifier follows before it has read stmt). Is this the reason why these cases are rejected during compilation?

If so, is there any simple way to reject it in the grammar? Maybe some kind of AST post-processor? But then I guess we would need a separate version for Ripper, right?
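To make the post-processor idea concrete, here is a rough sketch that walks a Ripper-style S-expression after parsing and counts break nodes that are not inside a loop, block, or lambda. Because it runs on the finished tree, a post-loop such as break while true is no harder than a regular loop. The method name count_invalid_breaks and the two node-type lists are assumptions made for illustration. They do not cover everything compile.c checks (next and redo are ignored, and a break inside a loop condition is treated as valid).

```ruby
# Node types inside which `break` is accepted (illustrative, not exhaustive).
BREAKABLE = %i[while until while_mod until_mod for brace_block do_block lambda].freeze
# Node types that start a fresh scope where an outer loop no longer applies.
BARRIER = %i[program class module sclass def defs].freeze

# Hypothetical post-processor: counts `break` nodes with no enclosing
# loop/block/lambda. `node` is a nested array as returned by Ripper.sexp.
def count_invalid_breaks(node, breakable = false)
  return 0 unless node.is_a?(Array)

  count = 0
  case node.first
  when :break     then count += 1 unless breakable
  when *BREAKABLE then breakable = true
  when *BARRIER   then breakable = false
  end
  # Recurse into every child; non-array children contribute 0.
  count + node.sum { |child| count_invalid_breaks(child, breakable) }
end

# The tree shown above for 'class X; break; end'.
class_sexp = [:program, [[:class, [:const_ref, [:@const, "X", [1, 6]]], nil,
              [:bodystmt, [[:void_stmt], [:break, []]], nil, nil, nil]]]]
# A hand-written tree in the same shape for 'break while true'.
post_loop_sexp = [:program, [[:while_mod, [:var_ref, [:@kw, "true", [1, 12]]],
                  [:break, []]]]]

p count_invalid_breaks(class_sexp)     # prints 1
p count_invalid_breaks(post_loop_sexp) # prints 0
```

A walker like this only works on Ripper output. RubyVM::AbstractSyntaxTree nodes have a different shape, so the concern about needing a separate version still applies.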
