Bug #20578
Tokenizing a string literal that has a newline and an invalid escape is wrong
Status: Closed
Assignee: -
Target version: -
ruby -v: ruby 3.4.0dev (2024-06-13T09:49:46Z master 8b843b0dc7) [x86_64-linux]
Description
When a string literal includes \n and an invalid escape after it, the tokenize result is wrong.
Ripper.tokenize "\"hello\\x world"
# => ["\"", "hello\\x", " world"] # looks good
Ripper.tokenize "\"\nhello\\x world"
# => ["\"", "\n world", "hello\\x"] # order is reversed
These invalid escapes are also tokenized incorrectly:
Ripper.tokenize("\"\n\\Cxx\"") #=> ["\"", "\nx", "\\Cx", "\""]
Ripper.tokenize("\"\n\\Mxx\"") #=> ["\"", "\nx", "\\Mx", "\""]
Ripper.tokenize("\"\n\\c\\cx\"") #=> ["\"", "\nx", "\\c\\c", "\""]
Ripper.tokenize("\"\n\\ux\"") #=> ["\"", "\nx", "\""]
Ripper.tokenize("\"\n\\xx\"") #=> ["\"", "\nx", "\\x", "\""]
These literals are also tokenized incorrectly:
Ripper.tokenize("<<A\n\n\\xyz") #=> ["<<A", "\n", "\nyz", "\\x"]
Ripper.tokenize("%(\n\\xyz)") #=> ["%(", "\nyz", "\\x", ")"]
Ripper.tokenize("%Q(\n\\xyz)") #=> ["%Q(", "\nyz", "\\x", ")"]
Ripper.tokenize(":\"\n\\xyz\"") #=> [":\"", "\nyz", "\\x", "\""]
I encountered this while typing a valid string literal into IRB:
irb(main):001> "
irb(main):002> \x█
Another invalid escape sequence disappears from the tokenize result entirely:
Ripper.tokenize('"\u{123')
# => ["\""]
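Both symptoms above (reordered tokens and a missing token) break the same property: joining the tokens in the order returned should reproduce the source. A minimal sketch of that check follows. The helper name tokens_round_trip? is hypothetical and not part of Ripper, and the check itself is an assumption about what a correct result looks like, not something the report states.

```ruby
require "ripper"

# Hypothetical helper (not part of Ripper): true when the tokens,
# concatenated in the order returned, reproduce the source exactly.
def tokens_round_trip?(source)
  Ripper.tokenize(source).join == source
end

# Inputs taken from the report above.
samples = [
  "\"hello\\x world",    # invalid escape, no newline
  "\"\nhello\\x world",  # newline before the invalid escape
  '"\u{123',             # unterminated \u{...} escape
]

samples.each do |src|
  status = tokens_round_trip?(src) ? "ok" : "MISMATCH"
  puts "#{status}: #{src.inspect} -> #{Ripper.tokenize(src).inspect}"
end
```

Going by the outputs quoted above, the second and third samples would fail this check on the reported build. Results on other Ruby versions may differ.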
Updated by nobu (Nobuyoshi Nakada) 5 months ago
- Status changed from Open to Closed
Applied in changeset git|2e59cf00cc35183fe9b616672cb8d2b461b1cf9b.
[Bug #20578] ripper: Fix dispatching part at invalid escapes