Bug #18738
Closed
IRB can't recognize heredoc after words
Description
My irb_info
irb(main):001:0> irb_info
=> 
Ruby version: 3.1.1                            
IRB version: irb 1.4.1 (2021-12-25)            
InputMethod: ReidlineInputMethod with Reline 0.3.1
RUBY_PLATFORM: x86_64-linux                    
LANG env: en_US.UTF-8                          
East Asian Ambiguous Width: 1  
Please see the code below.
a, b = <<EOF, %w[ hello
thank you
ruby devs
EOF
world
]
p a
p b
This works as expected if you save it to a file and run it with ruby xxx.rb. The output is:
"thank you\nruby devs\n"
["hello", "world"]
But when you type it into irb, the input never terminates, and you get:
❯ irb
irb(main):001:0] a, b = <<EOF, %w[ hello
irb(main):002:0] thank you
irb(main):003:0] ruby devs
irb(main):004:0] EOF
irb(main):005:0] world
irb(main):006:0" ]
irb(main):007:-" 
irb(main):008:0" 
irb(main):009:0" 
irb(main):010:0" 
I found this issue while reading the mruby source code. In mruby, the token after hello on the first line should be tHD_LITERAL_DELIM, but CRuby has no such token. I dumped CRuby's parser state and found that, right after reading <<EOF, the parser recognizes the whole heredoc body "thank you\nruby devs\n" as a single token. So I think this may not be a bug in Ripper itself, but in how IRB's ruby-lex calls Ripper line by line.
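A minimal sketch of that hypothesis, using only Ripper from the standard library. It re-parses the accumulated buffer after each typed line and reports whether the input is complete. The helper complete? and the constants LINES, STATUS and TOKEN_TYPES are made up for illustration; this is not IRB's actual ruby-lex logic, only a check that Ripper handles the snippet when it is given the whole buffer.

```ruby
require "ripper"

# The snippet from the report, one entry per typed line.
LINES = [
  "a, b = <<EOF, %w[ hello",
  "thank you",
  "ruby devs",
  "EOF",
  "world",
  "]",
].freeze

# Hypothetical helper (not part of IRB): true when Ripper parses the
# accumulated buffer without reporting a syntax error.
def complete?(buffer)
  !Ripper.sexp(buffer).nil?
end

# Re-parse the whole buffer after each new line, the way a REPL could
# decide whether to keep prompting for more input.
STATUS = LINES.each_index.map do |i|
  complete?(LINES[0..i].join("\n") + "\n")
end

# Token types Ripper reports when it sees the full snippet at once.
TOKEN_TYPES = Ripper.lex(LINES.join("\n") + "\n").map { |tok| tok[1] }

STATUS.each_with_index do |ok, i|
  puts format("%d  %-24s %s", i + 1, LINES[i], ok ? "complete" : "incomplete")
end
```

In this sketch the buffer only becomes complete after the closing ] on the sixth line, and the full-buffer token stream contains both heredoc and %w tokens, which is consistent with the problem being in how the lines are fed to Ripper rather than in Ripper's tokenizer.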
For your convenience, here is the parser state:
Stack now 0 2 82 341
Entering state 580
Next token is token "string literal" (1.7-1.12: )
Shifting token "string literal" (1.7-1.12: )
Entering state 60
Reducing stack by rule 613 (line 4830):
-> $$ = nterm string_contents (1.12-1.12: )
Stack now 0 2 82 341 580 60
Entering state 301
Reading a token: Next token is token "literal content" (1.12-1.12: "thank you\nruby devs\n")
Shifting token "literal content" (1.12-1.12: "thank you\nruby devs\n")
Entering state 507
Reducing stack by rule 619 (line 4926):
   $1 = token "literal content" (1.12-1.12: "thank you\nruby devs\n")
-> $$ = nterm string_content (1.12-1.12: )
Stack now 0 2 82 341 580 60 301
Entering state 511
Reducing stack by rule 614 (line 4840):
   $1 = nterm string_contents (1.12-1.12: )
   $2 = nterm string_content (1.12-1.12: )
-> $$ = nterm string_contents (1.12-1.12: )
Stack now 0 2 82 341 580 60
Entering state 301
Reading a token: 
lex_state: BEG -> END at line 7453
Next token is token "terminator" (1.12-1.12: )
Shifting token "terminator" (1.12-1.12: )
Entering state 512
Reducing stack by rule 596 (line 4693):
   $1 = token "string literal" (1.7-1.12: )
   $2 = nterm string_contents (1.12-1.12: )
   $3 = token "terminator" (1.12-1.12: )
-> $$ = nterm string1 (1.7-1.12: )
Stack now 0 2 82 341 580
Entering state 109
Reducing stack by rule 594 (line 4683):
   $1 = nterm string1 (1.7-1.12: )
-> $$ = nterm string (1.7-1.12: )
Stack now 0 2 82 341 580
Entering state 108
Reading a token: 
lex_state: END -> BEG|LABEL at line 9814
Next token is token ',' (1.12-1.13: )
        
           Updated by kaiquekandykoga (Kaíque Koga) over 3 years ago
      I think it can be interesting to open an issue at https://github.com/ruby/irb.
        
           Updated by ccmywish (Aoran Zeng) over 3 years ago
      kaiquekandykoga (Kaíque Koga) wrote in #note-1:
I think it can be interesting to open an issue at https://github.com/ruby/irb.
Now here: https://github.com/ruby/irb/issues/361
        
           Updated by ccmywish (Aoran Zeng) over 2 years ago
      This has been fixed. Please close it.
        
           Updated by jeremyevans0 (Jeremy Evans) over 2 years ago
      - Status changed from Open to Closed