Bug #20465
parse.y adds an extra empty string to the AST
Description
Given this code:
t0 = '\\xc1'
"#{t0}"
The AST is like this:
$ ./miniruby --dump=parsetree test.rb
###########################################################
## Do NOT use this node dump for any purpose other than ##
## debug and research. Compatibility is not guaranteed. ##
###########################################################
# @ NODE_SCOPE (id: 8, line: 1, location: (1,0)-(2,7))
# +- nd_tbl: :t0
# +- nd_args:
# |   (null node)
# +- nd_body:
#     @ NODE_BLOCK (id: 6, line: 1, location: (1,0)-(2,7))
#     +- nd_head (1):
#     |   @ NODE_LASGN (id: 0, line: 1, location: (1,0)-(1,12))*
#     |   +- nd_vid: :t0
#     |   +- nd_value:
#     |       @ NODE_STR (id: 1, line: 1, location: (1,5)-(1,12))
#     |       +- string: "\\xc1"
#     +- nd_head (2):
#         @ NODE_DSTR (id: 4, line: 2, location: (2,0)-(2,7))*
#         +- string: ""
#         +- nd_next->nd_head:
#         |   @ NODE_EVSTR (id: 3, line: 2, location: (2,0)-(2,7))
#         |   +- nd_body:
#         |       @ NODE_LVAR (id: 2, line: 2, location: (2,3)-(2,5))
#         |       +- nd_vid: :t0
#         +- nd_next->nd_next:
#             (null node)
There is an empty DSTR. I don't think that DSTR should be there since it's not part of the source.
Updated by mame (Yusuke Endoh) 9 months ago
How would this be a problem?
I think it intentionally creates the structure to ensure that it returns a new String object. Unlike #20457, there is no lack of information. I don't think it is so inconvenient.
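As a small illustration of the "new String object" point, the sketch below checks that an interpolation containing only a String still yields a distinct object with the same contents. The method name `interpolate_copy` is made up for this example; it shows the observable behaviour only and says nothing about how parse.y or compile.c achieves it.

```ruby
# Hypothetical helper: wraps a value in an interpolation with no literal text.
def interpolate_copy(value)
  "#{value}"
end

t0 = '\\xc1'
copy = interpolate_copy(t0)

# Same contents, but a different object from the one interpolated.
puts "equal contents: #{copy == t0}"
puts "same object:    #{copy.equal?(t0)}"
```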
Updated by tenderlovemaking (Aaron Patterson) 9 months ago
mame (Yusuke Endoh) wrote in #note-1:
> How would this be a problem?
> I think it intentionally creates the structure to ensure that it returns a new String object. Unlike #20457, there is no lack of information. I don't think it is so inconvenient.
It's not a problem for the compiler (of course), but a problem for the language server use case. Users didn't write "" there, so consumers of the AST will have to detect and handle this case. If the AST is supposed to reflect the source document, then I would expect there to not be a "" node.
AFAICT, this node is only added to the AST to simplify compile.c. In my opinion, the AST should closely reflect the source and compile.c should do the work to compile it. Similar reasoning to Bug #20457.
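To make the "detect and handle this case" burden concrete, here is a rough sketch of the kind of filtering an AST consumer might end up writing. The `DStr`, `EvStr`, and `Str` structs are hypothetical stand-ins loosely modelled on the dump above, not the real node classes, and `written_parts` is an invented name. It also assumes an empty leading string is always synthetic, which may not hold in every case.

```ruby
# Hypothetical, simplified node shapes (not the real parser node classes).
DStr  = Struct.new(:string, :parts) # leading literal plus the remaining parts
EvStr = Struct.new(:source)         # an interpolated #{...} segment
Str   = Struct.new(:string)         # a literal string segment

# Returns only the parts assumed to come from the source text,
# dropping a leading empty literal (assumed here to be parser-inserted).
def written_parts(dstr)
  parts = dstr.parts.compact # skip "(null node)" entries
  head = dstr.string
  if head.nil? || head.empty?
    parts
  else
    [Str.new(head), *parts]
  end
end

node = DStr.new("", [EvStr.new("t0"), nil])
p written_parts(node)
```

Every tool that walks the tree would need some equivalent of this check, which is the cost being described in the comment above.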
Updated by nobu (Nobuyoshi Nakada) 8 months ago
tenderlovemaking (Aaron Patterson) wrote in #note-2:
> It's not a problem for the compiler (of course), but a problem for the language server use case. Users didn't write "" there, so consumers of the AST will have to detect and handle this case. If the AST is supposed to reflect the source document, then I would expect there to not be a "" node.
I think it should be there, since there is an empty string part before #.