Named ripper fields
One of the biggest pain-points working with the existing ripper subclasses is that you have to know what each array index represents in any given node. I'm proposing adding a subclass of ripper where every field is named so that it makes it easier to work with different versions of ruby. The PR is here: https://github.com/ruby/ruby/pull/5679. Below is copied the description from the PR.
This is a new subclass of the Ripper parser. It is based on/extracted from the work done in https://github.com/ruby-syntax-tree/syntax_tree. This subclass is similar to SexpBuilderPP in that it provides individual shapes per production rule from ripper. However, there are a couple of differences:
- Tree uses class instances instead of arrays to represent nodes.
- Comments are automatically attached to the various nodes when parsing is finished.
- A couple of additional nodes are added for clarity (i.e., ArgStar, Not, PinnedBegin, etc.).
- Every node has location information attached to it (as opposed to just the scanner event nodes).
- There's a standard interface for descending the tree (child_nodes).
- Additionally, each node has pattern matching (both array and hash patterns) as well as pretty_print defined to make it easier to work with.
I think we should ship this with Ruby to make it easier to build tools (like formatters and linters) on top of this structure. It also will make it easier to change the parser in the future (if we ever do) because any tools built on top of these objects will not have to worry about the specific order of the nodes (unlike the SexpBuilderPP version) since everything has a named field. Additionally, since everything is written in pure Ruby, it makes it easy for other implementations of Ruby to benefit.
Updated by mame (Yusuke Endoh) 9 months ago
We discussed this ticket at the dev meeting, but no conclusion has been reached. Here is a list of comments:
- It is a pure Ruby library. Why not gem? We need a good reason to merge it to the package.
- It is huge and therefore difficult to maintain. Can't we (semi-)automatically generate the code from parse.y?
- We don't guarantee the compatibility of AST between Ruby versions. This library could mislead users into thinking it guarantees comatibility.
Updated by kddeisz (Kevin Newton) 9 months ago
To respond to your points @mame (Yusuke Endoh):
- It can live in a gem, and it is in https://github.com/ruby-syntax-tree/syntax_tree. The issue is, it's hard to maintain completely outside the ruby/ruby repository because if things change it's hard to catch. I believe this was the same reason ripper got merged in in the first place, to keep up with upstream.
- I think maintaining code generation from parse.y is a lot harder than maintaining these classes. The classes themselves are basically just structs. Code generation is error-prone and would be very hard to debug. I tried many times to generate it from parse.y, and there were always so many edge cases.
- It's totally fine to not guarantee compatibility of the AST. That's the nice thing about these objects, is that they can wrap up incompatibility. For example, recently when block forwarding got merged, it was simply a matter of changing how the
Paramsclass was being built. By having this object layer in front, we don't have to worry about incompatibility so much.