Backport #1921

closed

Ripper Heredoc Parsing

Added by akeep (Andy Keep) over 14 years ago. Updated about 13 years ago.

Status:
Closed
[ruby-core:24855]

Description

=begin
I'm using the Ripper parser extension in a project I'm working on for school, and I noticed that heredocs do not seem to be processed correctly. The scanner events are dispatched in the expected order (heredoc_beg, then tstring_content, then heredoc_end), but at the parser level only the heredoc_end comes back in the results.
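The scanner-level half of this can be checked on its own with Ripper.lex, which reports the token stream without involving the parser. This is only a minimal sketch: the variable names are illustrative, and the exact shape of each token entry varies between Ruby versions, so it reads only the event name and the token text.

```ruby
require "ripper"

src = "<<-EOF\nThis is a\ntest of heredocs\nEOF"

# Each entry returned by Ripper.lex starts with [position, event, text];
# newer Rubies append a lexer state, which is ignored here.
tokens = Ripper.lex(src)

# Keep only the three scanner events named in the report.
wanted = %i[on_heredoc_beg on_tstring_content on_heredoc_end]
heredoc_events = tokens.map { |tok| tok[1] }.select { |ev| wanted.include?(ev) }

# Collect the text carried by the tstring_content token(s).
heredoc_body = tokens.select { |tok| tok[1] == :on_tstring_content }
                     .map { |tok| tok[2] }
                     .join

puts heredoc_events.inspect
puts heredoc_body.inspect
```

If the body text shows up here but not in the sexp output, the content is being lost after the scanner, not in it.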

To quickly demonstrate using Ripper's sexp generator:

$ irb -rripper

Ripper.sexp("<<-EOF\nThis is a\ntest of heredocs\nEOF")
=> [:program, [[:string_literal, [:string_content, [:heredoc_end, "EOF", [4, 0]]]]]]
Ripper.sexp_raw("<<-EOF\nThis is a\ntest of heredocs\nEOF")
=> [:program, [:stmts_add, [:stmts_new], [:string_literal, [:string_add, [:string_content], [:heredoc_end, "EOF", [4, 0]]]]]]

Here instead of the expected string content, "This is a\ntest of heredocs\n", we get [:heredoc_end, "EOF", [4, 0]] from the scanner.

While Ripper's sexp and sexp_raw do not deliver the correct result, the correct scanner events are being fired, so it is not just a matter of the scanner events being incorrect or the content string being lost. I think the problem stems from the fact that heredocs are handled within the lexer: at the parser level only one value representing the string (or at least the final string) is expected, but the scanner sends two final events, first tstring_content and then heredoc_end, and each of them overwrites the yyval. The right way to tackle this may be to fundamentally change how heredocs are handled, but the simpler approach is to let heredoc_end fire without overwriting the yyval, since the parser is not expecting a heredoc value anyway.

I've included a patch to parse.y that basically adds a function called ripper_dispatch_ignored_scan_event, which mimics ripper_dispatch_scan_event, but does not update the yyval. It then uses this to dispatch the heredoc_end event to avoid overwriting the tstring_content token.
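The idea behind the patch can be modelled in a few lines of Ruby. This is a toy illustration only, not the contents of the patch: ToyLexer and its methods are hypothetical names that mirror the two C functions mentioned above, and the real code in parse.y is C.

```ruby
# Toy model only: ToyLexer is hypothetical and does not exist in Ruby or parse.y.
class ToyLexer
  attr_reader :yyval, :events

  def initialize
    @yyval = nil   # stands in for the single value the parser reads
    @events = []   # records every scanner event that fired
  end

  # Models ripper_dispatch_scan_event: fire the event and store its result.
  def dispatch_scan_event(name, text)
    @events << name
    @yyval = [name, text]
  end

  # Models the proposed ripper_dispatch_ignored_scan_event: fire the event
  # but leave the stored value untouched.
  def dispatch_ignored_scan_event(name, text)
    @events << name
    nil
  end
end

body = "This is a\ntest of heredocs\n"

# Current behaviour: the second dispatch overwrites the first.
unpatched = ToyLexer.new
unpatched.dispatch_scan_event(:tstring_content, body)
unpatched.dispatch_scan_event(:heredoc_end, "EOF")

# Patched behaviour: heredoc_end still fires, but the content survives.
patched = ToyLexer.new
patched.dispatch_scan_event(:tstring_content, body)
patched.dispatch_ignored_scan_event(:heredoc_end, "EOF")

puts unpatched.yyval.inspect
puts patched.yyval.inspect
```

In both cases the same two events fire, so handlers that listen for heredoc_end are unaffected; only the value left for the parser differs.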

With my patch:

$ irb -rripper

Ripper.sexp("t = <<-EOF\nThis is a\ntest of heredocs\nEOF")
=> [:program, [[:assign, [:var_field, [:ident, "t", [1, 0]]], [:string_literal, [:string_content, [:tstring_content, "This is a\ntest of heredocs\n", [2, 0]]]]]]]
Ripper.sexp_raw("t = <<-EOF\nThis is a\ntest of heredocs\nEOF")
=> [:program, [:stmts_add, [:stmts_new], [:assign, [:var_field, [:ident, "t", [1, 0]]], [:string_literal, [:string_add, [:string_content], [:tstring_content, "This is a\ntest of heredocs\n", [2, 0]]]]]]]
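The patched output above could also be turned into a small script-level check. This is a sketch that assumes a Ruby build in which the heredoc content reaches the parser; event names in the sexp can differ between Ruby versions, so it searches for the string itself and does not match on symbols.

```ruby
require "ripper"

src = "t = <<-EOF\nThis is a\ntest of heredocs\nEOF"
expected_body = "This is a\ntest of heredocs\n"

# Flatten both trees and look for the heredoc body anywhere inside them.
sexp_has_body = Ripper.sexp(src).flatten.include?(expected_body)
raw_has_body  = Ripper.sexp_raw(src).flatten.include?(expected_body)

puts "sexp contains heredoc body:     #{sexp_has_body}"
puts "sexp_raw contains heredoc body: #{raw_has_body}"
```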
=end


Files

parse.y.heredoc_patch (981 Bytes): Patch to fix Ripper handling of HEREDOC (generated with svn diff off rev 25401). akeep (Andy Keep), 10/20/2009 12:02 AM