Backport #1921

Ripper Heredoc Parsing

Added by akeep (Andy Keep) almost 10 years ago. Updated about 8 years ago.

Status: Closed
Priority: Normal
[ruby-core:24855]

Description

=begin
I'm using the Ripper parser extension in a project I'm working on for school, and I noticed that heredocs do not seem to process correctly. What seems to be happening is that the scanner events are dispatched: heredoc_beg, then tstring_content, then heredoc_end, but at the parser level only the heredoc_end is coming back in the results.
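The scanner-level event order described above can be checked with Ripper.lex. This is only a minimal sketch: it assumes a Ruby whose Ripper.lex returns entries of the form [[line, col], event, token, ...], and the variable names are illustrative.

```ruby
require 'ripper'

src = "<<-EOF\nThis is a\ntest of heredocs\nEOF"

# Each entry from Ripper.lex is [[line, col], event, token, ...];
# index 1 is the scanner event name, index 2 the token text.
tokens = Ripper.lex(src)
events = tokens.map { |tok| tok[1] }

# The token text attached to the tstring_content event, if any.
heredoc_body = tokens.find { |tok| tok[1] == :on_tstring_content }&.fetch(2)

p events
p heredoc_body
```

If the scanner is behaving as described, the heredoc_beg, tstring_content and heredoc_end events should all appear, in that order, with the heredoc body carried by tstring_content.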

To quickly demonstrate using Ripper's sexp generator:

$ irb -rripper

Ripper.sexp("<<-EOF\nThis is a\ntest of heredocs\nEOF")
=> [:program, [[:string_literal, [:string_content, [:heredoc_end, "EOF", [4, 0]]]]]]
Ripper.sexp_raw("<<-EOF\nThis is a\ntest of heredocs\nEOF")
=> [:program, [:stmts_add, [:stmts_new], [:string_literal, [:string_add, [:string_content], [:heredoc_end, "EOF", [4, 0]]]]]]

Here instead of the expected string content, "This is a\ntest of heredocs\n", we get [:heredoc_end, "EOF", [4, 0]] from the scanner.

While Ripper's sexp and sexp_raw do not return the expected result, the correct scanner events are being fired, so this is not just an issue of the scanner events being incorrect or the content string being lost. I think the problem stems from the fact that heredocs are handled within the lexer: at the parser level only one value representing the string (or at least the final string) is expected, but the scanner sends two final events, first the tstring_content and then the heredoc_end, both of which overwrite the yyval. The right way to tackle this may be to fundamentally change how heredocs are handled, but the simpler approach is to allow the heredoc_end to fire without overwriting the yyval, since the parser is not expecting a heredoc value anyway.

I've included a patch to parse.y that basically adds a function called ripper_dispatch_ignored_scan_event, which mimics ripper_dispatch_scan_event, but does not update the yyval. It then uses this to dispatch the heredoc_end event to avoid overwriting the tstring_content token.
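To see from the Ruby side which scanner value actually reaches the parser, a small Ripper subclass can record what is handed to the string_add parser event. This is a rough sketch and not part of the patch: HeredocProbe and its seen list are hypothetical names, and it assumes that the return value of a scanner event handler is what the parser-level handler receives.

```ruby
require 'ripper'

# Hypothetical probe class (not part of Ripper or the patch): it records
# the value that the parser passes to on_string_add for each string piece.
class HeredocProbe < Ripper
  attr_reader :seen

  def initialize(src)
    super(src)
    @seen = []
  end

  # Scanner events: tag the token so its origin is visible later.
  def on_tstring_content(tok)
    [:tstring_content, tok]
  end

  def on_heredoc_end(tok)
    [:heredoc_end, tok]
  end

  # Parser event: item is whatever scanner value the parser ended up with.
  def on_string_add(list, item)
    @seen << item
    list
  end
end

probe = HeredocProbe.new("t = <<-EOF\nThis is a\ntest of heredocs\nEOF")
probe.parse
p probe.seen
```

On a Ruby with the problem described here, I would expect the recorded item to be the heredoc_end value; with the fix applied it should be the tstring_content value.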

With my patch:

$ irb -rripper

Ripper.sexp("t = <<-EOF\nThis is a\ntest of heredocs\nEOF")
=> [:program, [[:assign, [:var_field, [:ident, "t", [1, 0]]], [:string_literal, [:string_content, [:tstring_content, "This is a\ntest of heredocs\n", [2, 0]]]]]]]
Ripper.sexp_raw("t = <<-EOF\nThis is a\ntest of heredocs\nEOF")
=> [:program, [:stmts_add, [:stmts_new], [:assign, [:var_field, [:ident, "t", [1, 0]]], [:string_literal, [:string_add, [:string_content], [:tstring_content, "This is a\ntest of heredocs\n", [2, 0]]]]]]]
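The same check can be scripted instead of read off the irb output. The following is a minimal sketch; tstring_contents is a hypothetical helper, and it accepts both :tstring_content and :@tstring_content as node tags on the assumption that the tag spelling may differ between Ruby versions.

```ruby
require 'ripper'

# Hypothetical helper: depth-first search of a Ripper sexp that collects
# the text of every tstring_content node it finds.
def tstring_contents(node, found = [])
  return found unless node.is_a?(Array)

  if [:tstring_content, :@tstring_content].include?(node[0])
    found << node[1]
  else
    node.each { |child| tstring_contents(child, found) }
  end
  found
end

sexp = Ripper.sexp("t = <<-EOF\nThis is a\ntest of heredocs\nEOF")
contents = tstring_contents(sexp)
p contents
```

With the fix in place the heredoc body should be the only string collected; without it the list should come back empty, since only the heredoc_end node is present.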
=end


Files

parse.y.heredoc_patch (981 Bytes): Patch to fix Ripper handling of HEREDOC (generated with svn diff off rev 25401). akeep (Andy Keep), 10/20/2009 12:02 AM

Associated revisions

Revision 35d36573
Added by yugui (Yuki Sonoda) over 9 years ago

merges r25402 from trunk into ruby_1_9_1. fixes the backport task #1921.

  • parse.y (parser_here_document): dispatch delayed heredoc contents. based on a patch from Andy Keep in [ruby-core:24855].

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_1_9_1@26008 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

History

#1

Updated by nobu (Nobuyoshi Nakada) almost 10 years ago

  • Status changed from Open to Feedback

=begin
Where's your patch?
=end

#2

Updated by akeep (Andy Keep) almost 10 years ago

=begin
Hmmm... That is odd. I had attached the patch when I initially created this ticket (or at least I thought I had). Here is a patch, updated for SVN revision 25401.
=end

#3

Updated by nobu (Nobuyoshi Nakada) almost 10 years ago

  • Status changed from Feedback to Closed
  • % Done changed from 0 to 100

=begin
This issue was solved with changeset r25402.
Andy, thank you for reporting this issue.
You have greatly contributed to Ruby.
May Ruby be with you.

=end

#4

Updated by nobu (Nobuyoshi Nakada) over 9 years ago

  • Status changed from Closed to Assigned
  • Assignee set to yugui (Yuki Sonoda)

=begin

=end

#5

Updated by yugui (Yuki Sonoda) over 9 years ago

  • Status changed from Assigned to Closed

=begin
This issue was solved with changeset r26008.
Andy, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

=end
