Feature #17276

Ripper stops tokenizing after keyword as a method parameter

Added by no6v (Nobuhiro IMAI) about 1 month ago. Updated 11 days ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
[ruby-core:100470]

Description

Although these are obviously syntax errors at the moment, the following
code cannot be tokenized correctly by Ripper.tokenize.

$ cat src.rb
def req(true) end
def opt(true=0) end
def rest(*true) end
def keyrest(**true) end
def block(&true) end
->true{}
->true=0{}
->*true{}
->**true{}
->&true{}
$ ruby -rripper -vlne 'p Ripper.tokenize($_)' src.rb
ruby 3.0.0dev (2020-10-21T00:24:47Z master da25affdac) [x86_64-linux]
["def", " ", "req", "(", "true", ")"]
["def", " ", "opt", "(", "true", "=", "0", ")"]
["def", " ", "rest", "(", "*", "true", ")"]
["def", " ", "keyrest", "(", "**", "true", ")"]
["def", " ", "block", "(", "&", "true", ")"]
["->", "true", "{"]
["->", "true", "=", "0", "{"]
["->", "*", "true", "{"]
["->", "**", "true", "{"]
["->", "&", "true", "{"]

end and } are not shown in the results.

This seems to prevent irb from determining the continuity of the input.
See: https://github.com/ruby/irb/issues/38

Updated by jeremyevans0 (Jeremy Evans) 13 days ago

  • Backport deleted (2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN)
  • ruby -v deleted (ruby 3.0.0dev (2020-10-21T00:24:47Z master da25affdac) [x86_64-linux])
  • Tracker changed from Bug to Feature

Ripper records errors, but Ripper.tokenize and Ripper.lex have no way to return them. Here's how you can handle errors with Ripper::Lexer (shown for tokenize; lex works the same way):

require 'ripper'
r = Ripper::Lexer.new('def req(true) end', 'a', 1)
p r.tokenize
# => ["def", " ", "req", "(", "true", ")"]
p r.errors
# => [#<Ripper::Lexer::Elem: on_parse_error@1:8:END: "true": syntax error, unexpected `true', expecting ')'>]

This is not a bug; it is a limitation of the API for Ripper.tokenize and Ripper.lex. Changing Ripper.tokenize and Ripper.lex to raise an exception is possible, but it would break backwards compatibility.

Maybe we could support keyword arguments in Ripper.lex and Ripper.tokenize to raise SyntaxError for errors? Here's a pull request for that approach: https://github.com/ruby/ruby/pull/3774
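
For illustration, a rough sketch of how such a keyword might look at the call site (the raise_errors name is taken from the commit that eventually landed below, so treat it as an assumption at this point in the discussion):

require 'ripper'

# Without the keyword, behavior is unchanged: tokens up to the error are returned.
p Ripper.tokenize('def req(true) end')
# => ["def", " ", "req", "(", "true", ")"]

# With the proposed keyword, a SyntaxError would be raised instead.
begin
  Ripper.tokenize('def req(true) end', raise_errors: true)
rescue SyntaxError => e
  puts e.message  # e.g. syntax error, unexpected `true', expecting ')'
end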

Updated by Eregon (Benoit Daloze) 13 days ago

jeremyevans0 (Jeremy Evans) wrote in #note-1:

Maybe we could support keyword arguments in Ripper.lex and Ripper.tokenize to raise SyntaxError for errors? Here's a pull request for that approach: https://github.com/ruby/ruby/pull/3774

I agree it would be nice.

Do you think the same would be possible for Ripper.sexp/sexp_raw?
Currently it just returns nil if there is an error, which is unhelpful if one wants to know why it failed to lex/parse:

> Ripper.sexp('def n')
=> nil

Updated by jeremyevans0 (Jeremy Evans) 13 days ago

Eregon (Benoit Daloze) wrote in #note-2:

jeremyevans0 (Jeremy Evans) wrote in #note-1:

Maybe we could support keyword arguments in Ripper.lex and Ripper.tokenize to raise SyntaxError for errors? Here's a pull request for that approach: https://github.com/ruby/ruby/pull/3774

I agree it would be nice.

Do you think the same would be possible for Ripper.sexp/sexp_raw?

Yes, the same is possible with Ripper.sexp/sexp_raw. I've updated the pull request to handle those as well.
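
For the sexp case, usage under the updated pull request would presumably look something like this (a sketch, assuming the same raise_errors keyword as for lex/tokenize):

require 'ripper'

p Ripper.sexp('def n')
# => nil (current behavior, with no explanation of the failure)

begin
  Ripper.sexp('def n', raise_errors: true)
rescue SyntaxError => e
  puts e.message  # the reason parsing failed
end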


Updated by jeremyevans (Jeremy Evans) 12 days ago

  • Status changed from Open to Closed

Applied in changeset git|cd0877a93e91fecb3066984b3fa2a762e6977caf.


Support raise_errors keyword for Ripper.{lex,tokenize,sexp,sexp_raw}

Implements [Feature #17276]

Updated by no6v (Nobuhiro IMAI) 11 days ago

Support raise_errors keyword for Ripper.{lex,tokenize,sexp,sexp_raw}

Implements [Feature #17276]

Thanks for your clarification and implementation.
(it seems that those two lines are the same :)
https://github.com/ruby/ruby/blob/cd0877a93e91fecb3066984b3fa2a762e6977caf/test/ripper/test_lexer.rb#L150-L151

Ripper::Lexer#{lex,tokenize} seem to accept a second or subsequent call and return the rest of the code as tokens.

$ cat src.rb
def req(true) end
def opt(true=0) end
def rest(*true) end
def keyrest(**true) end
def block(&true) end
->true{}
->true=0{}
->*true{}
->**true{}
->&true{}
$ cat l.rb
require "ripper"
lexer = Ripper::Lexer.new(ARGF.read)
until (tokens = lexer.tokenize).empty?
  p tokens
end
$ ruby l.rb src.rb
["def", " ", "req", "(", "true", ")"]
[" ", "end", "\n", "def", " ", "opt", "(", "true", "=", "0", ")"]
[" ", "end", "\n", "def", " ", "rest", "(", "*", "true", ")"]
[" ", "end", "\n", "def", " ", "keyrest", "(", "**", "true", ")"]
[" ", "end", "\n", "def", " ", "block", "(", "&", "true", ")"]
[" ", "end", "\n", "->", "true", "{"]
["}", "\n", "->", "true", "=", "0", "{"]
["}", "\n", "->", "*", "true", "{"]
["}", "\n", "->", "**", "true", "{"]
["}", "\n", "->", "&", "true", "{"]
["}", "\n"]

Ripper::Lexer#lex behaves the same way. Concatenating those tokens gives exactly what I wanted.
I would prefer Ripper.{lex,tokenize} to return the fully parsed tokens.
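
For reference, a minimal sketch of the concatenation mentioned above, based on the l.rb workaround (assuming repeated Ripper::Lexer#tokenize calls keep returning the remaining tokens, as in the output shown):

require "ripper"

src = File.read("src.rb")
lexer = Ripper::Lexer.new(src)
tokens = []
until (chunk = lexer.tokenize).empty?
  tokens.concat(chunk)
end
# The concatenated tokens cover the whole input, including the trailing end/} tokens.
p tokens.join == src  # expected to be true given the output above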

Updated by no6v (Nobuhiro IMAI) 11 days ago

I would prefer Ripper.{lex,tokenize} to return the fully parsed tokens.

pull request: https://github.com/ruby/ruby/pull/3791
