Project

General

Profile

Bug #1240

parser bug in 1.8.7 and 1.9.1p0

Added by thomer (Thomer Gil) over 10 years ago. Updated about 8 years ago.

Status:
Closed
Priority:
Normal
Target version:
-
ruby -v:
ruby 1.9.1p0 (2009-01-30 revision 21907) [i686-linux]
[ruby-core:22637]

Description

=begin
# ruby parser accepts first line, not the second.
x y { "#{}".z { } }
x y { "#{}".z do end }
=end

History

#1

Updated by zenspider (Ryan Davis) over 10 years ago

=begin

On Mar 2, 2009, at 22:30 , Thomer Gil wrote:

Bug #1240: parser bug in 1.8.7 and 1.9.1p0
http://redmine.ruby-lang.org/issues/show/1240

Author: Thomer Gil
Status: Open, Priority: Normal
Category: core
ruby -v: ruby 1.9.1p0 (2009-01-30 revision 21907) [i686-linux]

ruby parser accepts first line, not the second.

x y { "#{}".z { } }
x y { "#{}".z do end }

1.8.6 as well... this also appears to be the minimimal reproduction.

Remove anything and it parses. How confounding.

If/When this gets fixed, please respond with the patch or commit # so

I can track it. Thanks.

=end

#2

Updated by shyouhei (Shyouhei Urabe) over 10 years ago

  • Assignee set to matz (Yukihiro Matsumoto)

=begin

=end

#3

Updated by daz (Dave B) over 10 years ago

=begin
As there's no context for this "bug", I've tried to provide some.

The parser takes different routes depending on what has been seen,
so context is important.

def x(args=nil)
p [:x_args, args]
block_given? and (puts '--> x blk'; yield)
end

def y(args=nil)
p [:y_args, args]
block_given? and (puts '--> y blk'; yield)
end

class String
def z(args=nil)
p [:z_args, args]
block_given? and (puts '--> z blk'; yield)
end
end

x y { "#{}".z { "GIGO\n".display } }
# x y { "#{}".z do "GIGO\n".display end } # parser fail

# Contrasting with ...
p "#{}".z { "GIGO\n".display }
puts '---'
p "#{}".z do "GIGO\n".display end
# ... what was "do end" in the original report
# expected to bind to?

daz

=end

#4

Updated by zenspider (Ryan Davis) over 10 years ago

=begin

On Mar 3, 2009, at 13:42 , Dave B wrote:

As there's no context for this "bug", I've tried to provide some.

This isn't a "bug", it is a bug, as I think I've shown below.

Contrasting with ...

p "#{}".z { "GIGO\n".display }
puts '---'
p "#{}".z do "GIGO\n".display end

... what was "do end" in the original report

expected to bind to?

I find this view more clear:

% echo 'p "#{}".z do "GIGO\n".display end' | parse_tree_show
s(:iter,
s(:call,
nil,
:p,
s(:arglist, s(:call, s(:dstr, "", s(:evstr)), :z, s(:arglist)))),
nil,
s(:call, s(:str, "GIGO\n"), :display, s(:arglist)))

so the do/end (aka iter) is bound to the call to #p.

From the original report, minus the empty interpolation:

% echo 'x y { "".z do end }' | parse_tree_show
s(:call,
nil,
:x,
s(:arglist,
s(:iter,
s(:call, nil, :y, s(:arglist)),
nil,
s(:iter, s(:call, s(:str, ""), :z, s(:arglist)), nil)))

the interpolation would replace the s(:str, "") with:

% echo '"#{}"' | parse_tree_show
s(:dstr, "", s(:evstr))

So really there isn't any reason why this shouldn't be parseable. If

this isn't solved by the weekend I'll take a whack at it with

ruby_parser and translate/backport from there. I hate these

codepaths tho and I haven't gotten around to rewriting the string

stack yet, so it isn't fun.

=end

#5

Updated by mame (Yusuke Endoh) over 9 years ago

  • Assignee changed from matz (Yukihiro Matsumoto) to mame (Yusuke Endoh)

=begin
Hi,

2009/3/3 Thomer Gil redmine@ruby-lang.org:

ruby parser accepts first line, not the second.

x y { "#{}".z { } }
x y { "#{}".z do end }

Interesting. I found similar bugs:

while ("#{}".foo { } ); end # ok
while ("#{}".foo do end); end # parse error

while ->{ 1.times { } }.call; done # ok
while ->{ 1.times do end }.call; done # parse error

The cause is mishandling of cond_stack and cmdarg_stack.
When "#{}" is parsed, the parser pushes a value into these stacks once
(at #{ (tSTRING_DBEG)), and pops twice (at '}' and action of the rule
`string_content'.)

In addition, the parser forgets to push at ->{ (tLAMBEG).

I think it is better for not parser but lexer to push into the stacks
at tSTRING_DBEG. See my patch.

BTW, I also noticed that block call with do' keyword does not work in
until' condition:

until begin 1.times { } end do end # ok
until begin 1.times do end end do end # parse error
~~
until if true then 1.times { } end do end # ok
until if true then 1.times do end end do end # parse error
~~
until until true do 1.times { } end do end # ok
until until true do 1.times do end end do end # parse error
~~
until class Foo; 1.times { } ; end do end # ok
until class Foo; 1.times do end; end do end # parse error
~~
until case; when true; 1.times { } ; end do end # ok
until case; when true; 1.times do end; end do end # parse error
~~

This is because the underlined do's are not considered as block call
but beginning of
until' body.

Although this is confusing a little and can be actually fixed, the fix
needs many COND_PUSH(0)/COND_POP(), which may decrease performance and
code maintenability. In addition, writing such a long and complex
condition directly is absolutely bad (even insane) style.
So, we should accept the above behaviors as spec, I think.

Here is a patch to fix the issue Thomer reported.
I'll commit it unless there is objection.

diff --git a/parse.y b/parse.y
index 340a825..f234ae3 100644
--- a/parse.y
+++ b/parse.y
@@ -4033,14 +4033,10 @@ string_content : tSTRING_CONTENT
$$ = lex_strterm;
lex_strterm = 0;
lex_state = EXPR_BEG;

  • COND_PUSH(0);
  • CMDARG_PUSH(0); } compstmt '}' { lex_strterm = $2;
  • COND_LEXPOP();
  • CMDARG_LEXPOP(); /%%%/ if ($3) $3->flags &= ~NODE_FL_NEWLINE; $$ = new_evstr($3); @@ -5873,6 +5869,8 @@ parser_parse_string(struct parser_params *parser, NODE *quote) pushback(c); return tSTRING_DVAR; case '{':
  • COND_PUSH(0);
  • CMDARG_PUSH(0); return tSTRING_DBEG; } tokadd('#'); @@ -6070,6 +6068,8 @@ parser_here_document(struct parser_params *parser, NODE *here) pushback(c); return tSTRING_DVAR; case '{':
  • COND_PUSH(0);
  • CMDARG_PUSH(0); return tSTRING_DBEG; } tokadd('#'); @@ -7314,6 +7314,8 @@ parser_yylex(struct parser_params *parser) lex_state = EXPR_BEG; lpar_beg = 0; --paren_nest;
  • COND_PUSH(0);
  • CMDARG_PUSH(0); return tLAMBEG; } if (IS_ARG() || lex_state == EXPR_END)

--
Yusuke ENDOH mame@tsg.ne.jp
=end

#6

Updated by mame (Yusuke Endoh) over 9 years ago

=begin
Hi,

2010/4/8 Caleb Clausen vikkous@gmail.com:

Yusuke Endoh wrote:

BTW, I also noticed that block call with do' keyword does not work in
until' condition:

?until begin 1.times { } ? ?end do end # ok
?until begin 1.times do end end do end # parse error
? ? ? ? ? ? ? ? ? ? ?~~
[snip... more like that]

This is because the underlined do's are not considered as block call
but beginning of
until' body.

Although this is confusing a little and can be actually fixed, the fix
needs many COND_PUSH(0)/COND_POP(), which may decrease performance and
code maintenability. ?In addition, writing such a long and complex
condition directly is absolutely bad (even insane) style.
So, we should accept the above behaviors as spec, I think.

Wait.... please don't harden any bugs into the specification of the
language. If you want to say that it's too obscure/too hard to fix,
that's fine. WONTFIX is ok, under the circumstances. If you want to
say it's implementation dependent behavior, that's at least
acceptable. But let's not pretend that what's clearly a bug is
intended behavior. (Best of all would be to fix the bug, but I'm not
really asking for that.)

You are right. I guess it be considered as WONTFIX.

My own lexer & parser were able to parse all the examples in this bug
report correctly and I did not need to change anything.

Great.

BTW, how about the following? :-)

# ok
while (((((((((((((((((((((((((((((((false)))))))))))))))))))))))))))))))
do; end

# parse error
while ((((((((((((((((((((((((((((((((false))))))))))))))))))))))))))))))))
do; end

If your CPU is 64 bit, please run after 32 '(' and ')' are added :-p

IMHO, do_cond (omittable `do' keyword in the part of while/until/for
constructs) brings confusion but little benefit to the Ruby syntax.
How about removal of do_cond in 2.0?

--
Yusuke ENDOH mame@tsg.ne.jp

=end

#7

Updated by coatl (caleb clausen) over 9 years ago

=begin
On 4/8/10, Yusuke ENDOH mame@tsg.ne.jp wrote:

2010/4/8 Caleb Clausen vikkous@gmail.com:

Wait.... please don't harden any bugs into the specification of the
language. If you want to say that it's too obscure/too hard to fix,
that's fine. WONTFIX is ok, under the circumstances. If you want to
say it's implementation dependent behavior, that's at least
acceptable. But let's not pretend that what's clearly a bug is
intended behavior. (Best of all would be to fix the bug, but I'm not
really asking for that.)

You are right. I guess it be considered as WONTFIX.

Thank you.

BTW, how about the following? :-)

ok

while (((((((((((((((((((((((((((((((false)))))))))))))))))))))))))))))))
do; end

parse error

while ((((((((((((((((((((((((((((((((false))))))))))))))))))))))))))))))))
do; end

Ooooo! Both of those don't parse right in redparse. But not for the reason you seem to expect; it's objecting to the do on its own line. That is to say, this also doesn't parse:

while false
do; end

Thanks for the bug. I'll fix it right away.

Taking out the newline before do makes them parse right again. (For MRI too, tho.)

Incidentally, all the ruby versions I have (1.8.6, 1.9.0, 1.9.1) also object to do after a newline. I guess this was added (relatively) recently?

If your CPU is 64 bit, please run after 32 '(' and ')' are added :-p

This was on a 32 bit cpu. I wouldn't anticipate any problems with any number of nested parentheses. (Until, that is, you run out of memory.)

IMHO, do_cond (omittable `do' keyword in the part of while/until/for
constructs) brings confusion but little benefit to the Ruby syntax.
How about removal of do_cond in 2.0?

Hear, hear! I agree wholeheartedly. (Unless there's someone out there who actually likes this? Speak now or forever hold your peace.)
=end

#8

Updated by shyouhei (Shyouhei Urabe) almost 9 years ago

  • Status changed from Open to Assigned

=begin

=end

Updated by mame (Yusuke Endoh) about 8 years ago

  • Status changed from Assigned to Closed

The original issue that OP reported was already fixed at r27359.
The commit caused a regression [ruby-core:29579], and nobu fixed
it at r27387.

So, closed. Thanks.

--
Yusuke Endoh mame@tsg.ne.jp

Also available in: Atom PDF