Feature #8321

Ripper: I would like coordinates for keywords

Added by Eric Promislow over 2 years ago. Updated over 2 years ago.

[ruby-core:54559]
Status:Open
Priority:Normal
Assignee:-

Description

=begin
Ripper gives the (({[line, column]})) coordinates for identifiers, strings, and numbers.

I would like Ripper to also append those coordinates to the nodes for most of the block keywords,
including (({:program})), (({:if})), (({:while})), (({:unless})), (({:end})), (({:def})), (({:class})), (({:module})), etc. As with
identifiers, the coordinates should go at the end of the node. So an (({if}))-block would be represented as
[0] :if
[1] CONDITION
[2] BLOCK
[3] [:elsif, ...] || [:else, ...] || nil
[4] [lineNo, colNo] # location of the leading :if/:elsif/:else/:unless

I currently get the first coordinate of ((%CONDITION%)), and then look up the preceding
(({:if}))/(({:elsif}))/(({:else})) using (({Ripper.lex(src).find_all{|posn, kwd, name| kwd == :on_kw && %w/if else elsif/.include?(name) }}))

So the info is in Ripper. It would be more convenient if I could get that info in the src tree.

Note that my suggestion won't break (most) existing code, as the new data goes at the end
of the list.

The same would be useful for other keywords, including (({:module})) (({:class})) (({:def})) (({:try})) (({:catch})) (({:begin})) (({:rescue})).
=end
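The Ripper.lex workaround described above can be sketched roughly as follows. The helper name keyword_positions and its default keyword list are illustrative assumptions, not part of Ripper; the only library call used is Ripper.lex, whose tokens start with a [line, column] pair, an event type and the token text.

```ruby
require 'ripper'

# Hypothetical helper (not part of Ripper): collect the [line, column]
# coordinates of selected keywords from the token stream.
def keyword_positions(src, keywords = %w[if elsif else unless])
  Ripper.lex(src)
        .select { |_pos, type, token| type == :on_kw && keywords.include?(token) }
        .map { |pos, _type, token| [pos, token] }
end

src = "if some\n  test\nelsif yet\n  another\nelse\n  test\nend\n"
p keyword_positions(src)
# prints [[[1, 0], "if"], [[3, 0], "elsif"], [[5, 0], "else"]]
```

The caller still has to match each position to the right node in the tree, which is the inconvenience this request is about.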

History

#1 Updated by Nobuyoshi Nakada over 2 years ago

  • Description updated (diff)

Could you illustrate?

#2 Updated by Bozhidar Batsov over 2 years ago

Here's an example:

[6] pry(main)> Ripper.sexp('alias :some :test')
=> [:program,
 [[:alias,
   [:symbol_literal, [:symbol, [:@ident, "some", [1, 7]]]],
   [:symbol_literal, [:symbol, [:@ident, "test", [1, 13]]]]]]]

Using the first identifier position is the common workaround for this problem, but it's a bit unreliable, since it might not be on the same line. That's why I generally use Ripper.lex to get exact keyword positions.
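A small sketch of why the first identifier is an unreliable stand-in for the keyword: when the condition is written on the line after (({if})), the two positions differ. The helper names keyword_position and first_ident_position are made up for this illustration.

```ruby
require 'ripper'

# Hypothetical helper: position of the first occurrence of a keyword,
# taken from the token stream.
def keyword_position(src, keyword)
  token = Ripper.lex(src).find { |_pos, type, tok| type == :on_kw && tok == keyword }
  token && token[0]
end

# Hypothetical helper: depth-first search for the first :@ident leaf in a
# Ripper.sexp tree; returns its [line, column] or nil.
def first_ident_position(node)
  return nil unless node.is_a?(Array)
  return node[2] if node[0] == :@ident

  node.each do |child|
    pos = first_ident_position(child)
    return pos if pos
  end
  nil
end

src = "if\n  some\n  test\nend\n"
p keyword_position(src, "if")              # prints [1, 0]
p first_ident_position(Ripper.sexp(src))   # prints [2, 2]
```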

#3 Updated by Nobuyoshi Nakada over 2 years ago

=begin
Do you mean changing the result for the following code from `((|current|))' to `((|proposal|))'?

*code
Ripper.sexp(<<SRC)
if some
  test
elsif yet
  another
else
  test
end
SRC

*current
[:program,
 [
  [:if,
   [:vcall, [:@ident, "some", [1, 3]]],
   [
    [:vcall, [:@ident, "test", [2, 2]]]
   ],
   [:elsif,
    [:vcall, [:@ident, "yet", [3, 6]]],
    [
     [:vcall, [:@ident, "another", [4, 2]]]
    ],
    [:else,
     [
      [:vcall, [:@ident, "test", [6, 2]]]
     ]
    ]
   ]
  ]
 ]
]

*proposal
[:program,
 [
  [:if,
   [:vcall, [:@ident, "some", [1, 3]]],
   [
    [:vcall, [:@ident, "test", [2, 2]]]
   ],
   [:elsif,
    [:vcall, [:@ident, "yet", [3, 6]]],
    [
     [:vcall, [:@ident, "another", [4, 2]]]
    ],
    [:else,
     [
      [:vcall, [:@ident, "test", [6, 2]]]
     ],
     [5, 0]   # location of `else'
    ],
    [3, 0]    # location of `elsif'
   ],
   [1, 0]     # location of `if'
  ]
 ]
]

Or do you want the end location of `(({vcall}))'s?
=end
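Until Ripper emits these coordinates itself, the proposed shape can be approximated by post-processing. The sketch below walks the Ripper.sexp tree in pre-order and appends keyword positions taken from Ripper.lex. It assumes the source contains only plain if/elsif/else chains, so that the n-th such node corresponds to the n-th such keyword; modifier (({if})), (({case ... else})) and (({rescue ... else})) would break that pairing. The names annotate! and sexp_with_keyword_positions are hypothetical.

```ruby
require 'ripper'

NODE_TYPES = %i[if elsif else].freeze
KEYWORDS   = %w[if elsif else].freeze

# Hypothetical helper: append a [line, column] pair to every :if/:elsif/:else
# node, consuming keyword positions in source order.
def annotate!(node, positions)
  return node unless node.is_a?(Array)

  # Claim the position before descending, so nested nodes get later keywords.
  pos = positions.shift if NODE_TYPES.include?(node[0])
  node.each { |child| annotate!(child, positions) }
  node << pos if pos
  node
end

# Assumption: src has no modifier-if, case/else or rescue/else.
def sexp_with_keyword_positions(src)
  positions = Ripper.lex(src)
                    .select { |_pos, type, tok| type == :on_kw && KEYWORDS.include?(tok) }
                    .map(&:first)
  annotate!(Ripper.sexp(src), positions)
end

src = "if some\n  test\nelsif yet\n  another\nelse\n  test\nend\n"
if_node = sexp_with_keyword_positions(src)[1][0]
p if_node.last         # prints [1, 0]  (the `if')
p if_node[3].last      # prints [3, 0]  (the `elsif')
p if_node[3][3].last   # prints [5, 0]  (the `else')
```

Because the coordinates are appended at the end of each node, code that reads the existing elements by index keeps working, which is the compatibility argument made in the description.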

#4 Updated by Bozhidar Batsov over 2 years ago

The example outlined in the proposal is exactly what I think we need. A lot of Ruby code analysis tools need exact coordinates of keywords, and currently there is no easy and reliable way to get them. This is even more important now that you've rejected #8383.

#5 Updated by Magnus Holm over 2 years ago

On Fri, May 10, 2013 at 9:13 AM, bozhidar (Bozhidar Batsov) <bozhidar@batsov.com> wrote:

> Issue #8321 has been updated by bozhidar (Bozhidar Batsov).
>
> The example outlined in the proposal is exactly what I think we need. A
> lot of Ruby code analysis tools need exact coordinates of keywords and
> currently there is no easy and reliable way to get them. This is even more
> important, now that you've rejected #8383

You can use the parser gem: https://github.com/whitequark/parser

It gives you a nice AST together with source maps:

require 'parser/current'
ast = Parser::CurrentRuby.parse('if a; b; else; c; end')

(if
  (send nil :a)
  (send nil :b)
  (send nil :c))

ast.source_map.expression # => 0...21 (the whole expression)
ast.source_map.keyword # => 0...2 (the "if")
ast.source_map.begin # => 4...5 (the ";")
ast.source_map.else # => 9...13 (the "else")
ast.source_map.end # => 18...21 (the "end")

And you can then check the children for more data:

(cond, tbranch, fbranch) = *ast

cond.source_map.expression # => 3...4 ("a")

All of these return a Source::Range which allows you to extract the source
and lineno/column:

cond.source_map.expression.to_source # => "a"
cond.source_map.expression.line # => 1
cond.source_map.expression.column # => 3

Parser also supports 1.8, 1.9, 2.0 and 2.1/trunk, and it ships with a
rewriter tool:
http://whitequark.org/blog/2013/04/26/lets-play-with-ruby-code/

#6 Updated by Bozhidar Batsov over 2 years ago

I'm keeping an eye on Parser, and I hope that one day we'll be able to use it in RuboCop (https://github.com/bbatsov/rubocop) so that it can support all Rubies out there.

As far as I know, however, Parser is not yet production ready, so I'd still like to see Ripper get some improvements in the meantime.
