Bug #21669
openThoroughly implement void value expression check
Description
A void-value-expression check is a syntax check that raises a SyntaxError if an expression that cannot grammatically return a value (a "void value expression," such as a return expression) appears in a context where a value is expected (e.g., the right-hand side of an assignment, the argument of a method call, etc.). A typical example rejected by this check is x = return.
However, it has become clear that this check is incomplete. This ticket summarizes the results of a discussion with @matz (Yukihiro Matsumoto) at the hackathon before RubyWorld Conference today.
1. Expressions containing branches¶
@matz (Yukihiro Matsumoto) said that an expression containing branches should be considered a void value expression only if it can be grammatically determined that all possible branches do not return a value.
Based on this, both parse.y and prism currently reject the following code with a "void value expression" SyntaxError, but it should be accepted:
x = begin
raise
return
rescue
"OK"
else
return
end
# Expected: Code parses and executes successfully.
# Actual: Rejected with "unexpected void value expression" SyntaxError.
Reason: The rescue clause can return a value ("OK"), so the entire begin expression is not void.
Conversely, the following code is currently accepted by both parse.y and prism, but it should be rejected:
x = begin
foo
rescue
return
else
return
end
# Expected: Rejected with SyntaxError.
# Actual: Code parses and executes.
Reason: This expression always does return, so it is void whether foo raises an exception or not.
Furthermore, case expressions must also be rejected similarly when all branches are void:
x =
case
when 1; return
when 2; return
else return
end
# Expected: Rejected with SyntaxError.
# Actual: Code parses and executes.
Note that if expressions appear to be implemented correctly as intended:
x = if rand < 0.5
return
else
return
end
# Expected: Rejected with SyntaxError.
# Actual: Rejected. (OK!)
2. Statement lists containing a void value expression¶
@matz (Yukihiro Matsumoto) also said that if a statements node contains a void value expression, the entire statement list should be treated as a void value expression, even if the void expression is not the last one.
Therefore, the following code is currently accepted but should be rejected:
x = begin
return
"NG"
end
# Expected: Rejected with SyntaxError.
# Actual: Code parses and executes.
The same applies to branches within other expressions:
x = if rand < 0.5
return
"NG"
else
return
end
# Expected: Rejected with SyntaxError.
# Actual: Code parses and executes.
Note that if a return statement is inside a branch, that statement cannot be considered a void value expression, so the entire expression also cannot be a void value expression.
x = begin
return if true
"NG"
end
# Expected: Code parses and executes.
# Actual: Code parses and executes as expected. (OK!)
This is considered a bug, but if it causes practical compatibility issues, it may be necessary to plan a migration path or revisit the specifications.
(Both prism and parse.y need to be fixed, but I'll assign it to the prism team for now.)
Updated by Eregon (Benoit Daloze) about 8 hours ago
This check is getting more and more complicated and I wonder of the value of having it.
In my opinion there is little value of have this check, so given the complexity and the cost to check it (needs extra traversals of potentially large parts of the AST) I would just remove the check entirely.
Has this check ever been useful to anyone?
Updated by Eregon (Benoit Daloze) about 8 hours ago
The whole concept of "void value expression" is also weird in Ruby, where every expression/statement has a value, so I think it feels like something that doesn't belong to Ruby.
It probably only triggers for pretty broken code, and even then not emitting the error or warning would result in correct execution, just some dead code (the assignment) would always be not executed.
Dead code can't be detected easily in Ruby, so why bother for such a narrow case that probably no one runs into?
Updated by Earlopain (Earlopain _) about 8 hours ago
ยท Edited
In prism, that whole check fits in about 130 lines: https://github.com/ruby/prism/blob/2ecd49dfeea1d0b00c3af609be82dd488d25c7f3/src/prism.c#L1027-L1166. Looking at that function, probably it would not be very difficult for prism to support this new behaviour.
Not so much complexity, but the runtime cost is there of course (I haven't measured it, I just know it's not zero).
I do believe this check has value. Consider return 1 and 2, which it currently rejects and also highlights that because of and precedence this basically becomes (return 1) and 2. That said, the cases here are certainly contrived.
Updated by mame (Yusuke Endoh) about 7 hours ago
Eregon (Benoit Daloze) wrote in #note-1:
This check is getting more and more complicated and I wonder of the value of having it.
IMO, the current is indeed complicated because it is inconsistent. I think the new behavior is consistent and thus simpler. At least it's explainable. I am a bit concerned about compatibility, though.
Earlopain (Earlopain _) wrote in #note-3:
In prism, that whole check fits in about 130 lines
As an aside, I noticed this inconsistency when Coverity Scan, a static analyzer for C, pointed out that the following branch in Prism was dead code:
https://github.com/ruby/prism/blob/2ecd49dfeea1d0b00c3af609be82dd488d25c7f3/src/prism.c#L1075-L1077
At line 1067, void_node = vn is executed (and vn is guaranteed to be non-NULL on the line above), so it's impossible for void_node to be NULL at line 1075. Thus, it is indeed dead code.
To remove the dead code, I started writing some test codes, and found these various inconsistencies. Incidentally, it wasn't a bug in Prism, but rather the result of perfectly replicating the inconsistent behavior in parse.y.