https://bugs.ruby-lang.org/
2022-10-19T11:42:14Z
Ruby Issue Tracking System
Ruby master - Feature #19070: Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methods
https://bugs.ruby-lang.org/issues/19070?journal_id=99730
2022-10-19T11:42:14Z
Eregon (Benoit Daloze)
<p>Doesn't <code>Ripper.lex</code> already provide this information?</p>
https://bugs.ruby-lang.org/issues/19070?journal_id=99967
2022-11-07T03:07:32Z
matz (Yukihiro Matsumoto)
matz@ruby.or.jp
<p>Sounds OK.</p>
<p>Matz.</p>
https://bugs.ruby-lang.org/issues/19070?journal_id=100185
2022-11-21T00:01:53Z
yui-knk (Kaneko Yuichiro)
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Closed</i></li></ul><p>Applied in changeset <a class="changeset" title="Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methods Implementation for Lan..." href="https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/d8601621edcf29e3323b90dcf04b774edd9fb45e">git|d8601621edcf29e3323b90dcf04b774edd9fb45e</a>.</p>
<hr>
<p>Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methods</p>
<p>Implementations of the Language Server Protocol (LSP) sometimes need token information.<br>
For example, <code>m(1)</code> and <code>m(1, )</code> have the same AST structure apart from node locations,<br>
so it is impossible to detect the presence of <code>,</code> from the AST alone. In the latter case, however,<br>
it might be better to suggest a list of variables for the second argument.<br>
Token information is important in such cases.</p>
<p>This commit adds the following:</p>
<ul>
<li>Add <code>keep_tokens</code> option for <code>RubyVM::AbstractSyntaxTree.parse</code>, <code>.parse_file</code> and <code>.of</code>
</li>
<li>Add <code>RubyVM::AbstractSyntaxTree::Node#tokens</code>, which returns the tokens for the node, including the tokens of its descendant nodes.</li>
<li>Add <code>RubyVM::AbstractSyntaxTree::Node#all_tokens</code>, which returns all tokens of the input script regardless of the receiver node.</li>
</ul>
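<p>As an illustration of the <code>m(1)</code> versus <code>m(1, )</code> case described above, the following is a minimal sketch rather than code from the commit. It assumes CRuby 3.2 or later, and <code>comma_tokens</code> is a hypothetical helper name introduced only for this example. It relies on each token being an array whose third element is the token text.</p>

```ruby
# Sketch only: assumes CRuby 3.2 or later, where keep_tokens is available.
# comma_tokens is a hypothetical helper, not part of Ruby or of the commit.
def comma_tokens(src)
  root = RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)
  # Each token is [id, type, text, [first_lineno, first_column, last_lineno, last_column]].
  # all_tokens covers the whole input script, whichever node receives the call.
  root.all_tokens.select { |_id, _type, text, _location| text == "," }
end

without_comma = comma_tokens("m(1)")
with_comma    = comma_tokens("m(1, )")

# The two inputs are distinguished by tokens, not by AST shape.
puts "m(1)   has a comma token: #{!without_comma.empty?}"
puts "m(1, ) has a comma token: #{!with_comma.empty?}"
```

<p>A tool such as a language server could use a check like this to decide whether to offer completion candidates for a second argument. Calling <code>tokens</code> on a specific node instead of <code>all_tokens</code> would restrict the search to that node's range.</p>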
<p>[Feature <a class="issue tracker-2 status-5 priority-4 priority-default closed" title="Feature: Enhance keep_tokens option for RubyVM::AbstractSyntaxTree parsing methods (Closed)" href="https://bugs.ruby-lang.org/issues/19070">#19070</a>]</p>
<p>Impacts on memory usage and performance are below:</p>
<p>Memory usage:</p>
<pre><code>$ cat test.rb
root = RubyVM::AbstractSyntaxTree.parse_file(File.expand_path('../test/ruby/test_keyword.rb', __FILE__), keep_tokens: true)
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby -v
ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
11408kb
# keep_tokens: false
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
17508kb
# keep_tokens: true
$ /usr/bin/time -f %Mkb /usr/local/bin/ruby test.rb
30960kb
</code></pre>
<p>Performance:</p>
<pre><code>$ cat ../ast_keep_tokens.yml
prelude: |
  src = <<~SRC
    module M
      class C
        def m1(a, b)
          1 + a + b
        end
      end
    end
  SRC
benchmark:
  without_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: false)
  with_keep_tokens: |
    RubyVM::AbstractSyntaxTree.parse(src, keep_tokens: true)
$ make benchmark COMPARE_RUBY="./ruby" ARGS=../ast_keep_tokens.yml
/home/kaneko.y/.rbenv/shims/ruby --disable=gems -rrubygems -I../benchmark/lib ../benchmark/benchmark-driver/exe/benchmark-driver \
--executables="compare-ruby::./ruby -I.ext/common --disable-gem" \
--executables="built-ruby::./miniruby -I../lib -I. -I.ext/common ../tool/runruby.rb --extout=.ext -- --disable-gems --disable-gem" \
--output=markdown --output-compare -v ../ast_keep_tokens.yml
compare-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
built-ruby: ruby 3.2.0dev (2022-11-19T09:41:54Z 19070-keep_tokens d3af1b8057) [x86_64-linux]
warming up..
|                     |compare-ruby|built-ruby|
|:--------------------|-----------:|---------:|
|without_keep_tokens  |     21.659k|   21.303k|
|                     |       1.02x|         -|
|with_keep_tokens     |      6.220k|    5.691k|
|                     |       1.09x|         -|
</code></pre>