Bug #5002

Ripper fails to distinguish local vars from vcalls [PATCH]

Added by Michael Edgar almost 3 years ago. Updated almost 3 years ago.

[ruby-core:37908]
Status:Closed
Priority:Normal
Assignee:Nobuyoshi Nakada
Category:core
Target version:1.9.3
ruby -v:- Backport:

Description

Ripper always parses the variable grammar production (which includes identifiers, {i,c,g}vars, nil, FILE, etc) as a var_ref node, whose only child is the token itself.

This is a problem for one huge reason: local variables look exactly like vcalls: no-arg, no-receiver method calls. More importantly, the parse tree defines whether a given bareword identifier is a local variable reference or a method call. Thus, given a ripper parse tree, in order to distinguish local variable references from vcalls, one must reconstruct the parse order, re-implement the local variable introduction rules (local variable assigned in some way, for loops, block arg, rescue exception variable, named regex capture groups, ....), and then relabel those var_ref nodes which are method calls as vcall nodes.

This is quite a nasty workaround. There are a lot of edge cases to mess up. I've implemented it as the ripper-plus gem, but it's a huge pain, I'm not sure it's entirely correct, and is something the parser should be doing anyway.

The funny thing is, the parser already is doing almost all of the work! It's just not looking at the local variable tables when it comes time to generate the Ripper event. The patch I've attached does do so - it's a small change for a huge benefit for Ripper users.

I'd like to see this land in 1.9.3 - it's a small patch, and given the other bug fixes Ripper's had this cycle, would make Ripper pretty much sufficient for an entire Ruby implementation.

ripper.vcall.diff Magnifier (1.64 KB) Michael Edgar, 07/09/2011 12:24 PM

vcall.breaking.diff Magnifier - Implementation which changes error messages for `nil = foo` (3.51 KB) Michael Edgar, 07/10/2011 04:55 AM

vcall.same_errors.diff Magnifier - Implementation with additional grammar rules to match existing error messages (4.24 KB) Michael Edgar, 07/10/2011 04:55 AM

Associated revisions

Revision 32498
Added by Nobuyoshi Nakada almost 3 years ago

  • parse.y (var_ref): distinguish vcall from local variable references. based on a patch by Michael Edgar michael.j.edgar AT dartmouth.edu. Bug #5002

History

#1 Updated by Nobuyoshi Nakada almost 3 years ago

  • ruby -v changed from ruby 1.9.3dev (2011-07-09 trunk 32466) [x86_64-darwin10.8.0] to -

Hi,

At Sat, 9 Jul 2011 12:24:04 +0900,
Michael Edgar wrote in :

The funny thing is, the parser already is doing almost all
of the work! It's just not looking at the local variable
tables when it comes time to generate the Ripper event. The
patch I've attached does do so - it's a small change for a
huge benefit for Ripper users.

I don't think 'self', 'nil', 'true', and so on are vcall. Of
course they are neither really variables, but a kind of it
syntactically. Reviewing from this point, your patch seems
wrong about use of get_id(), and I suspect it might need to
split the "variable" rule.

--
Nobu Nakada

#2 Updated by Michael Edgar almost 3 years ago

Ack - I missed how nil/self would be caught as vcalls there.

As you note, splitting the variable node is necessary. I split 'variable' into 'uservariable' and 'keywordvariable', removing 'variable' entirely (since it would give r/r conflicts). However, this turns out to be a good thing overall: every other use of the variable production is on the LHS (or LHS-like constructs, like rescue Foo => exc), which bar the use of keywords anyway through manual checking.

So, I have two patches: one which just replaces all other uses of variable with user_variable. This is "vcall.breaking.diff", as it breaks some previous error-case behavior. The allowed syntax is identical; the error reported on invalid syntax (self = 'foo'), is not. It will, in this patch, simply emit a generic bison "unexpected ..." error, as the grammar will not match such constructs.

The second patch adds back in keyword_variable productions, so that error reporting will be the same, even though the new productions will always generate errors. This is "vcall.same_errors.diff". For a patch release, I imagine maintaining the errors may be preferable, so I've included it for completeness.

#3 Updated by Nobuyoshi Nakada almost 3 years ago

Hi,

At Sun, 10 Jul 2011 04:55:07 +0900,
Michael Edgar wrote in :

As you note, splitting the variable node is necessary. I
split 'variable' into 'uservariable' and 'keywordvariable',
removing 'variable' entirely (since it would give r/r
conflicts). However, this turns out to be a good thing
overall: every other use of the variable production is on
the LHS (or LHS-like constructs, like rescue Foo

#4 Updated by Motohiro KOSAKI almost 3 years ago

  • Category set to core
  • Status changed from Open to Assigned
  • Assignee set to Nobuyoshi Nakada

#5 Updated by Nobuyoshi Nakada almost 3 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r32498.
Michael, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • parse.y (var_ref): distinguish vcall from local variable references. based on a patch by Michael Edgar michael.j.edgar AT dartmouth.edu. Bug #5002

Also available in: Atom PDF