Bug #14027

Ripper parses squiggly heredoc incorrectly

Added by mjago (Martyn Jago) almost 2 years ago. Updated almost 2 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 2.5.0dev (2017-10-18 trunk 60205) [x86_64-darwin13]
[ruby-core:83343]

Description

When two or more embedded expressions appear on the same line within a squiggly heredoc,
separated by whitespace, Ripper treats that whitespace as ignored space even though it is not at the
beginning of a line. Below is a diff of the lexer output for an otherwise identical plain heredoc and squiggly heredoc that highlights the difference:

-[[[1, 0], :on_heredoc_beg, "<<-E", EXPR_BEG],
+[[[1, 0], :on_heredoc_beg, "<<~E", EXPR_BEG],
  [[1, 4], :on_nl, "\n", EXPR_BEG],
  [[2, 0], :on_embexpr_beg, "\#{", EXPR_BEG],
  [[2, 2], :on_int, "1", EXPR_END|EXPR_ENDARG],
  [[2, 3], :on_embexpr_end, "}", EXPR_END],
- [[2, 4], :on_tstring_content, " ", EXPR_BEG],
+ [[2, 4], :on_ignored_sp, " ", EXPR_BEG],
  [[2, 5], :on_embexpr_beg, "\#{", EXPR_BEG],
  [[2, 7], :on_int, "2", EXPR_END|EXPR_ENDARG],
  [[2, 8], :on_embexpr_end, "}", EXPR_END],
  [[2, 9], :on_tstring_content, "\n", EXPR_BEG],
  [[3, 0], :on_heredoc_end, "E", EXPR_BEG]]
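
The comparison above can also be reproduced from a script. The following is a minimal sketch, assuming only that Ripper is available; the helper name `lex_events_on_line` is hypothetical and not part of Ripper. It prints the event and text of every token on the heredoc body line, so any `:on_tstring_content` versus `:on_ignored_sp` difference is easy to spot.

```ruby
require "ripper"
require "pp"

# Hypothetical helper (not part of Ripper): return [event, text] pairs for
# every token that Ripper.lex reports on the given source line.
def lex_events_on_line(src, line)
  Ripper.lex(src)
        .select { |tok| tok[0][0] == line }   # tok[0] is the [line, column] pair
        .map    { |tok| [tok[1], tok[2]] }    # keep the event name and token text
end

# Same heredoc body, once with <<-E and once with <<~E.
%w[<<-E <<~E].each do |opener|
  src = "#{opener}\n\#{1} \#{2}\nE"
  puts opener
  pp lex_events_on_line(src, 2)
end
```

On an affected Ruby, the entry for the space between the two expressions should differ between the two listings; on a Ruby that includes the fix, the two listings should match.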

Also, the sexp shows the whitespace as an empty string:

[:program,
  [[:string_literal,
    [:string_content,
     [:string_embexpr, [[:@int, "1", [2, 2]]]],
-    [:@tstring_content, " ", [2, 4]],
+    [:@tstring_content, "", [2, 5]],
     [:string_embexpr, [[:@int, "2", [2, 7]]]],
     [:@tstring_content, "\n", [2, 9]]]]]]
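
To inspect just the string fragments instead of the whole tree, a small walker over the `Ripper.sexp` result is enough. This is a sketch only; `tstring_contents` is a hypothetical helper, not a Ripper API. It collects the text and position of every `:@tstring_content` node.

```ruby
require "ripper"
require "pp"

# Hypothetical helper (not part of Ripper): walk a sexp and collect
# [text, [line, column]] for every :@tstring_content node.
def tstring_contents(node, acc = [])
  return acc unless node.is_a?(Array)
  if node[0] == :@tstring_content
    acc << [node[1], node[2]]
  else
    node.each { |child| tstring_contents(child, acc) }
  end
  acc
end

# Same heredoc body, once with <<-E and once with <<~E.
%w[<<-E <<~E].each do |opener|
  src = "#{opener}\n\#{1} \#{2}\nE"
  puts opener
  pp tstring_contents(Ripper.sexp(src))
end
```

An empty string in the `<<~E` listing where the `<<-E` listing has `" "` corresponds to the difference shown in the diff above.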

This can be reproduced on head, 2.4.2, and 2.3.5.

Scripts used:

ruby -rripper -rpp -e '%{<<-E\n\#{1} \#{2}\nE}'
ruby -rripper -rpp -e '%{<<~E\n\#{1} \#{2}\nE}'

Associated revisions

Revision 0e7936f8
Added by nobu (Nobuyoshi Nakada) almost 2 years ago

lexer.rb: no dedent strings in middle

  • ext/ripper/lib/ripper/lexer.rb (on_heredoc_dedent): dedent only strings at the beginning, not strings in middle. [ruby-core:83343] [Bug #14027]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60212 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 60212
Added by nobu (Nobuyoshi Nakada) almost 2 years ago

lexer.rb: no dedent strings in middle

  • ext/ripper/lib/ripper/lexer.rb (on_heredoc_dedent): dedent only strings at the beginning, not strings in middle. [ruby-core:83343] [Bug #14027]

History

#1

Updated by mjago (Martyn Jago) almost 2 years ago

  • Description updated (diff)

#2

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

  • Status changed from Open to Feedback
  • Description updated (diff)

I can see a difference only at the first line.

$ diff -u <(./ruby -v -rripper -rpp -e 'pp Ripper.lex("%{<<-E\n\#{1} \#{2}\nE}")') \
  <(./ruby -v -rripper -rpp -e 'pp Ripper.lex("%{<<~E\n\#{1} \#{2}\nE}")')
--- /dev/fd/63  2017-10-19 07:10:01.000000000 +0900
+++ /dev/fd/62  2017-10-19 07:10:01.000000000 +0900
@@ -1,6 +1,6 @@
 ruby 2.5.0dev (2017-10-18 trunk 60207) [x86_64-darwin15]
 [[[1, 0], :on_tstring_beg, "%{", EXPR_BEG],
- [[1, 2], :on_tstring_content, "<<-E\n", EXPR_BEG],
+ [[1, 2], :on_tstring_content, "<<~E\n", EXPR_BEG],
  [[2, 0], :on_embexpr_beg, "\#{", EXPR_BEG],
  [[2, 2], :on_int, "1", EXPR_END|EXPR_ENDARG],
  [[2, 3], :on_embexpr_end, "}", EXPR_END],

#3

Updated by mjago (Martyn Jago) almost 2 years ago

I'm very sorry, nobu. I copied and pasted the wrong script.
This is what I meant to express:

diff -u <(ruby -v -rripper -rpp -e 'pp Ripper.lex("<<-E\n\#{1} \#{2}\nE")') \
<(ruby -v -rripper -rpp -e 'pp Ripper.lex("<<~E\n\#{1} \#{2}\nE")')

diff -u <(ruby -v -rripper -rpp -e 'pp Ripper.sexp("<<-E\n\#{1} \#{2}\nE")')  \
<(ruby -v -rripper -rpp -e 'pp Ripper.sexp("<<~E\n\#{1} \#{2}\nE")')

#4

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

  • Status changed from Feedback to Closed

Applied in changeset trunk|r60212.


lexer.rb: no dedent strings in middle

  • ext/ripper/lib/ripper/lexer.rb (on_heredoc_dedent): dedent only strings at the beginning, not strings in middle. [ruby-core:83343] [Bug #14027]
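
The rule stated in the commit message, dedent only strings at the beginning and not strings in the middle, can be illustrated with a simplified sketch. This is not the actual `on_heredoc_dedent` code from lexer.rb. The token shape `[[line, col], event, text]`, the helper name `dedent_line_starts`, and the space-only handling (tabs are ignored) are all assumptions made for illustration.

```ruby
# Illustrative only: strip up to `width` leading spaces from string-content
# tokens that start at column 0, and leave mid-line tokens untouched.
# Hypothetical token shape: [[line, col], event, text].
def dedent_line_starts(tokens, width)
  tokens.map do |(line, col), event, text|
    # Tokens that are not string content, or that start mid-line, pass through.
    next [[line, col], event, text] unless event == :on_tstring_content && col.zero?

    stripped = text.sub(/\A {0,#{width}}/, "")
    removed  = text.size - stripped.size
    [[line, col + removed], event, stripped]
  end
end

tokens = [
  [[2, 0], :on_tstring_content, "  a "],  # starts the line: eligible for dedent
  [[2, 4], :on_embexpr_beg,     "\#{"],
  [[2, 6], :on_int,             "1"],
  [[2, 7], :on_embexpr_end,     "}"],
  [[2, 8], :on_tstring_content, " "],     # mid-line whitespace: must be kept
]

dedent_line_starts(tokens, 2).each { |tok| p tok }
```

With a width of 2, only the first token loses its leading spaces; the single space after the embedded expression stays a string-content token, which is the behavior the report asks for.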
