Bug #20465

parse.y adds an extra empty string to the AST

Added by tenderlovemaking (Aaron Patterson) 9 months ago. Updated 8 months ago.

Status: Open
Assignee: -
Target version: -
ruby -v: ruby 3.4.0dev (2024-05-02T15:27:18Z master 7c0cf71049) [arm64-darwin23]
[ruby-core:117752]

Description

Given this code:

t0 = '\\xc1'
"#{t0}"

The AST is like this:

$ ./miniruby --dump=parsetree test.rb
###########################################################
## Do NOT use this node dump for any purpose other than  ##
## debug and research.  Compatibility is not guaranteed. ##
###########################################################

# @ NODE_SCOPE (id: 8, line: 1, location: (1,0)-(2,7))
# +- nd_tbl: :t0
# +- nd_args:
# |   (null node)
# +- nd_body:
#     @ NODE_BLOCK (id: 6, line: 1, location: (1,0)-(2,7))
#     +- nd_head (1):
#     |   @ NODE_LASGN (id: 0, line: 1, location: (1,0)-(1,12))*
#     |   +- nd_vid: :t0
#     |   +- nd_value:
#     |       @ NODE_STR (id: 1, line: 1, location: (1,5)-(1,12))
#     |       +- string: "\\xc1"
#     +- nd_head (2):
#         @ NODE_DSTR (id: 4, line: 2, location: (2,0)-(2,7))*
#         +- string: ""
#         +- nd_next->nd_head:
#         |   @ NODE_EVSTR (id: 3, line: 2, location: (2,0)-(2,7))
#         |   +- nd_body:
#         |       @ NODE_LVAR (id: 2, line: 2, location: (2,3)-(2,5))
#         |       +- nd_vid: :t0
#         +- nd_next->nd_next:
#             (null node)

The DSTR node starts with an empty string (string: ""). I don't think that empty string should be there, since it's not part of the source.

Updated by mame (Yusuke Endoh) 9 months ago

How would this be a problem?

I think it intentionally creates the structure to ensure that it returns a new String object. Unlike #20457, there is no lack of information. I don't think it is so inconvenient.
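The "new String object" behaviour mentioned above can be observed from plain Ruby, independent of how the parser represents it. A minimal sketch follows; the helper name interpolate is hypothetical and exists only for illustration:

```ruby
# Hypothetical helper: wraps a string literal that contains a single interpolation.
def interpolate(value)
  "#{value}"
end

t0 = '\\xc1'.freeze
s = interpolate(t0)

puts s == t0        # contents are equal
puts s.equal?(t0)   # identity check: s is a different object from t0
puts s.frozen?      # the interpolated result is not frozen, although t0 is
```

Whether the empty string node is needed to get this behaviour is exactly what the thread is discussing; the sketch only shows the observable result.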

Updated by tenderlovemaking (Aaron Patterson) 9 months ago

mame (Yusuke Endoh) wrote in #note-1:

> How would this be a problem?
>
> I think it intentionally creates the structure to ensure that it returns a new String object. Unlike #20457, there is no lack of information. I don't think it is so inconvenient.

It's not a problem for the compiler (of course), but a problem for the language server use case. Users didn't write "" there, so consumers of the AST will have to detect and handle this case. If the AST is supposed to reflect the source document, then I would expect there to not be a "" node.

AFAICT, this node is only added to the AST to simplify compile.c. In my opinion, the AST should closely reflect the source and compile.c should do the work to compile it. Similar reasoning to Bug #20457.
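To illustrate the "detect and handle" burden on AST consumers, here is a rough sketch using a simplified, hypothetical node model. The Node struct and the source_parts helper are assumptions made for this example; they are not the RubyVM::AbstractSyntaxTree API, and the rule "a leading empty string is synthetic" is itself an assumption that a real tool would have to verify against source locations:

```ruby
# Hypothetical, simplified node model (NOT the real parser API).
Node = Struct.new(:type, :children)

# Returns the DSTR parts that a tool would treat as written by the user,
# dropping a leading empty string and any null "next" entries.
def source_parts(dstr)
  raise ArgumentError, "expected a DSTR node" unless dstr.type == :DSTR

  head, *rest = dstr.children
  rest = rest.compact
  # Assumption: a nil or empty leading string was added by the parser.
  head.nil? || head == "" ? rest : [head, *rest]
end

# Mirrors the dump above: string "", one EVSTR wrapping LVAR t0, null next.
lvar  = Node.new(:LVAR, [:t0])
evstr = Node.new(:EVSTR, [lvar])
dstr  = Node.new(:DSTR, ["", evstr, nil])

parts = source_parts(dstr)
puts parts.map { |part| part.is_a?(Node) ? part.type : part.inspect }.join(", ")
```

Every consumer that wants a source-faithful view would need some filter of this kind, which is the cost being argued about here.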

Updated by nobu (Nobuyoshi Nakada) 8 months ago

tenderlovemaking (Aaron Patterson) wrote in #note-2:

> It's not a problem for the compiler (of course), but a problem for the language server use case. Users didn't write "" there, so consumers of the AST will have to detect and handle this case. If the AST is supposed to reflect the source document, then I would expect there to not be a "" node.

I think it should be there, since there is an empty string part before #.
