Feature #8321

Ripper: I would like coordinates for keywords

Added by Eric Promislow 12 months ago. Updated 12 months ago.

[ruby-core:54559]
Status:Open
Priority:Normal
Assignee:-
Category:-
Target version:-

Description

=begin
Ripper gives the (({[line, column]})) coordinates for identifiers, strings, and numbers.

I would like it if it appended those coordinates to most of the block keywords,
including (({:program})), (({:if})), (({:while})), (({:unless})), (({:end})), (({:def})), (({:class})), (({:module})), etc. As with the
identifiers, it should go at the end. So an (({if}))-block would be represented as
[0] :if
[1] CONDITION
[2] BLOCK
[3] [:elsif, ...] || [:else, ...] || nil
[4] [lineNo, colNo] # location of the leading :if/:elsif/:else/:unless

I currently get the first coordinate of ((%CONDITION%)), and then look up the preceding
(({:if}))/(({:elsif}))/(({:else})) using (({Ripper.lex(src).find_all{|posn, kwd, name| kwd == :on_kw && %w/if else elsif/.include?(name) }}))

So the info is in Ripper. It would be more convenient if I could get that info in the src tree.

Note that my suggestion won't break (most) existing code, as the new data goes at the end
of the list.

The same would be useful for other keywords, including (({:module})) (({:class})) (({:def})) (({:try})) (({:catch})) (({:begin})) (({:rescue})).
=end
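The workaround described above can be sketched roughly as follows. This is only an illustration: the helper name `keyword_positions` and its default keyword list are assumptions, not part of Ripper. It relies on `Ripper.lex` returning tokens whose first three elements are the position, the event type, and the token text.

```ruby
require 'ripper'

# Hypothetical helper: returns [[line, column], keyword] pairs for the
# requested keywords, in source order.
def keyword_positions(src, keywords = %w[if elsif else unless])
  Ripper.lex(src).each_with_object([]) do |(pos, type, tok), found|
    # Newer Rubies add a fourth element (lexer state); it is ignored here.
    found << [pos, tok] if type == :on_kw && keywords.include?(tok)
  end
end

src = <<~SRC
  if some
    test
  elsif yet
    another
  else
    test
  end
SRC

keyword_positions(src).each do |(line, col), kw|
  puts "#{kw} at line #{line}, column #{col}"
end
```

The positions still have to be matched back to nodes in the tree by hand, which is the inconvenience this request is about.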

History

#1 Updated by Nobuyoshi Nakada 12 months ago

  • Description updated (diff)

Could you illustrate?

#2 Updated by Bozhidar Batsov 12 months ago

Here's an example:

[6] pry(main)> Ripper.sexp('alias :some :test')
=> [:program,
[[:alias,
[:symbol_literal, [:symbol, [:@ident, "some", [1, 7]]]],
[:symbol_literal, [:symbol, [:@ident, "test", [1, 13]]]]]]]

Using the first identifier position is the common workaround for this problem, but it's a bit unreliable, since it might not be on the same line. That's why I generally use Ripper.lex to get exact keyword positions.
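A small sketch of why the first-identifier workaround is unreliable, assuming an `alias` statement whose arguments are placed on the following line (the source string is made up for illustration):

```ruby
require 'ripper'

# The keyword sits on line 1, its arguments on line 2.
src = "alias\n  :some :test\n"

# Position of the first identifier, dug out of the sexp tree:
# [:program, [[:alias, [:symbol_literal, [:symbol, [:@ident, "some", pos]]], ...]]]
first_ident_pos = Ripper.sexp(src).dig(1, 0, 1, 1, 1, 2)

# Position of the keyword itself, taken from the token stream.
alias_token = Ripper.lex(src).find { |_pos, type, tok| type == :on_kw && tok == 'alias' }
alias_kw_pos = alias_token.first

puts "first identifier at #{first_ident_pos.inspect}"
puts "alias keyword at #{alias_kw_pos.inspect}"
```

The two positions are on different lines, so a tool that reports the identifier's line as the location of `alias` would be off by one here.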

#3 Updated by Nobuyoshi Nakada 12 months ago

=begin
Do you mean changing the following code from `((|current|))' to `((|proposal|))'?

*code
Ripper.sexp(<<SRC)
if some
  test
elsif yet
  another
else
  test
end
SRC

*current
[:program,
[
[:if,
[:vcall, [:@ident, "some", [1, 3]]],
[
[:vcall, [:@ident, "test", [2, 2]]]
],
[:elsif,
[:vcall, [:@ident, "yet", [3, 6]]],
[
[:vcall, [:@ident, "another", [4, 2]]]
],
[:else,
[
[:vcall, [:@ident, "test", [6, 2]]]
]
]
]
]
]
]

*proposal
[:program,
[
[:if,
[:vcall, [:@ident, "some", [1, 3]]],
[
[:vcall, [:@ident, "test", [2, 2]]]
],
[:elsif,
[:vcall, [:@ident, "yet", [3, 6]]],
[
[:vcall, [:@ident, "another", [4, 2]]]
],
[:else,
[
[:vcall, [:@ident, "test", [6, 2]]]
],
[5, 0] # location of `else'
],
[3, 0] # location of `elsif'
],
[1, 0] # location of `if'
]
]
]

Or do you want the end location of `(({vcall}))'s?
=end
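Until Ripper does this itself, a tree of the proposed shape can be approximated outside Ripper. The following is a minimal sketch, not a patch: `sexp_with_keyword_positions` and `KEYWORD_NODES` are hypothetical names, and it assumes the source contains only plain `if`/`elsif`/`else` statements (no modifier `if`, `case ... else`, or `rescue ... else`), so that a pre-order walk of the tree meets the nodes in the same order as the lexer meets the keywords.

```ruby
require 'ripper'

# Node types that get a keyword position appended (hypothetical constant).
KEYWORD_NODES = %i[if elsif else].freeze

# Returns the Ripper.sexp tree with the [line, column] of the leading
# keyword appended to every :if, :elsif and :else node.
def sexp_with_keyword_positions(src)
  names = KEYWORD_NODES.map(&:to_s)
  # Keyword tokens in source order, as [text, [line, column]] pairs.
  keywords = Ripper.lex(src)
                   .select { |_pos, type, tok| type == :on_kw && names.include?(tok) }
                   .map { |pos, _type, tok| [tok, pos] }

  annotate = lambda do |node|
    return node unless node.is_a?(Array)

    position = nil
    if KEYWORD_NODES.include?(node.first)
      # Claim the next keyword before descending, i.e. in pre-order.
      tok, position = keywords.shift
      raise ArgumentError, "unexpected keyword #{tok.inspect}" unless tok == node.first.to_s
    end
    children = node.map { |child| annotate.call(child) }
    position ? children + [position] : children
  end

  annotate.call(Ripper.sexp(src))
end

src = "if some\n  test\nelsif yet\n  another\nelse\n  test\nend\n"
p sexp_with_keyword_positions(src)
```

Because the position is appended as the last element, code that reads nodes by index keeps working, which is the compatibility argument made in the description.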

#4 Updated by Bozhidar Batsov 12 months ago

The example outlined in the proposal is exactly what I think we need. A lot of Ruby code analysis tools need exact coordinates of keywords and currently there is no easy and reliable way to get them. This is even more important, now that you've rejected #8383

#5 Updated by Magnus Holm 12 months ago

On Fri, May 10, 2013 at 9:13 AM, bozhidar (Bozhidar Batsov) <bozhidar@batsov.com> wrote:

> Issue #8321 has been updated by bozhidar (Bozhidar Batsov).
>
> The example outlined in the proposal is exactly what I think we need. A
> lot of Ruby code analysis tools need exact coordinates of keywords and
> currently there is no easy and reliable way to get them. This is even more
> important, now that you've rejected #8383

You can use the parser gem: https://github.com/whitequark/parser

It gives you a nice AST together with source maps:

require 'parser/current'
ast = Parser::CurrentRuby.parse('if a; b; else; c; end')

(if
(send nil :a)
(send nil :b)
(send nil :c))

ast.source_map.expression # => 0...21 (the whole expression)
ast.source_map.keyword # => 0...2 (the "if")
ast.source_map.begin # => 4...5 (the ";")
ast.source_map.else # => 9...13 (the "else")
ast.source_map.end # => 18...21 (the "end")

And you can then check the children for more data:

(cond, tbranch, fbranch) = *ast

cond.source_map.expression # => 3...4 ("a")

All of these return a Source::Range which allows you to extract the source
and lineno/column:

cond.source_map.expression.to_source # => "a"
cond.source_map.expression.line # => 1
cond.source_map.expression.column # => 3

Parser also supports 1.8, 1.9, 2.0 and 2.1/trunk, and it ships with a
rewriter tool:
http://whitequark.org/blog/2013/04/26/lets-play-with-ruby-code/

#6 Updated by Bozhidar Batsov 12 months ago

I'm keeping an eye on Parser and I hope that one day we'll be able to use it in RuboCop https://github.com/bbatsov/rubocop to be able to support all Rubies out there.

As far as I know, however, Parser is not yet production-ready, so I'd still like to see Ripper get some improvements in the meantime.
