Feature #22111
Status: open
Non-symbolic hash keys with `expr : value` syntax
Description
Non-symbolic hash keys with expr : value syntax¶
Allow expr : value for non-symbolic hash keys.
Almost 20 years in the making, the missing puzzle piece for Hash's "new" colon syntax:
h = {
name: "symbol shorthand", # Ruby 1.9+
"quoted label": "symbol label", # Ruby 2.2+ (Feature #4935)
value_omission: , # Ruby 3.1+ (Feature #14579)
expr : "non symbol", # THIS PROPOSAL
}
Motivation¶
Ruby has several colon-based hash key syntaxes for symbolic keys, but the => ("hash rocket") is still needed for non-symbolic keys.
Adding expr : value is a backwards compatible leap forward toward allowing all-colon hashes for all key types:
## Before -- mixed styles
n = 42; { key1: "symbol", "key-#{2}": "quoted symbol", RUBY_VERSION:, n => "bar" }.keys
# => [:key1, :"key-2", :RUBY_VERSION, 42]
## After -- uniform colon syntax
n = 42; { key1: "symbol", "key-#{2}": "quoted symbol", RUBY_VERSION:, n : "bar" }.keys
# => [:key1, :"key-2", :RUBY_VERSION, 42]
Could this be what we need to one day retire our old friend "hash rocket" from Hashes entirely?
Completing the colon family¶
Ruby has two hash key syntax families: => (hash rocket) and : (hash colon).
| Key form | Rocket syntax | Colon syntax |
|---|---|---|
| Static symbol | :foo => value | foo: value |
| Quoted symbol | :"foo" => value | "foo": value |
| Value omission | N/A | name: |
| Non-symbolic | expr => value | ❌ expr : value (proposed) |
The gap for non-symbolic keys means they're forced to use rocket syntax.
This proposal is to finally complete the hash colon family for all key types.
| Example | Ruby | Feature | Key type |
|---|---|---|---|
| { name: value } | v1.9 | | Symbol |
| { "quoted label": value } | v2.2 | Feature #4935 | Symbol (quoted label) |
| { value_omission: } | v3.1 | Feature #14579 | Value omission |
| { expr : value } | ??? | Feature #22111 | Non-symbolic |
Reducing => overloading¶
The => token now serves more purposes than the Hash literal "hash rocket" it was originally used for. It is also used for:
- rightward assignment
- pattern capture
- rescue variables
Reducing the rocket's use in Hashes simplifies the language, especially for newcomers.
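For reference, the four roles of => can be seen side by side on current Ruby. This is only an illustrative sketch, assuming Ruby 3.1 or later; the method name rocket_roles and the sample values are made up for the example.

```ruby
# Illustrative only: `rocket_roles` is a hypothetical helper, not part of the proposal.
def rocket_roles
  # 1. Hash literal: key => value
  codes = { 200 => "OK" }

  # 2. Rightward assignment: value => pattern
  #    (raises NoMatchingPatternError, or a subclass, on mismatch)
  response = { status: 404 }
  response => { status: }

  # 3. Pattern capture: pattern => variable
  first =
    case [1, "two"]
    in [Integer => number, String]
      number
    end

  # 4. Rescue variable: ExceptionClass => variable
  error_class =
    begin
      Integer("not a number")
    rescue ArgumentError => error
      error.class
    end

  {
    hash_literal: codes,
    rightward_assignment: status,
    pattern_capture: first,
    rescue_variable: error_class
  }
end

p rocket_roles
```

Only the first of these is a key/value separator; the other three all bind a variable on the right-hand side.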
Design¶
Disambiguation¶
A symbol-eligible expression (bareword identifier or quoted string) followed by : with no space becomes a symbol key.
The same expression followed by : with a space cannot be a label, so it becomes a computed key:
{ a: 1 } # => {:a => 1} symbol
{ a : 1 } # => {1 => 1} where variable `a=1` (expr-colon where the space disambiguates from a symbol)
{ "a": 1 } # => {:a => 1} quoted symbol key
{ "a" : 1 } # => {"a" => 1} string (expr-colon where the space disambiguates from a symbol)
Expressions that cannot become symbols will work with or without a space, as there is no ambiguity to resolve:
{ 42 : 1 } # => {42 => 1}
{ 42: 1 } # => {42 => 1}
{ Math::PI : 1 } # => {3.141592653589793 => 1}
{ Math::PI: 1 } # => {3.141592653589793 => 1}
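For comparison, here is a rough sketch of how a Ruby without this proposal treats the label forms, using the standard-library Ripper. The helper names token_types and parses? are assumptions made for this example, and the last check is expected to be false only on Rubies that do not include the proposed grammar change.

```ruby
require "ripper"

# Hypothetical helper: Ripper token types for a snippet, ignoring whitespace.
def token_types(source)
  Ripper.lex(source).map { |token| token[1] } - [:on_sp]
end

# Hypothetical helper: true when Ripper parses the snippet without errors.
def parses?(source)
  !Ripper.sexp(source).nil?
end

# With no space before the colon, the lexer emits a label token,
# so the key is decided to be a Symbol before the grammar sees it.
p token_types('{ a: 1 }')      # contains :on_label
p token_types('{ "a": 1 }')    # contains :on_label_end

# The rocket form is the only way to write a non-symbolic key today.
p parses?('{ "a" => 1 }')

# The spaced form is not a label, and without this proposal it is rejected.
p parses?('{ "a" : 1 }')
```

Because the label decision is made by the lexer, the space before the colon is what currently separates "this is a label" from "this is something else".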
Relationship to Feature #22108¶
The earlier proposal Feature #22108 suggested { (expr): value } using parenthesized expressions with a lexer-generated label token (tLABEL_END).
I assumed there'd be too many dragons to fight with whitespace sensitivity.
However, after making the code changes somehow it just worked for all the test cases I threw at it. ¯\(ツ)/¯
| | Feature #22108 (expr): value | Feature #22111 expr : value |
|---|---|---|
| Parser changes | Lexer + grammar | Grammar only |
| New fields | hash_nest | None |
| LALR conflicts | 0 | 0 |
| Parens required? | Yes | No |
| Interpolated key | ("key-#{n}"): val | "key-#{n}" : val |
| Integer key | (42): val | 42: val |
This version is strictly more general, has a simpler implementation, and requires no lexer changes.
More examples¶
key3 = "key3"
def key12 = "key12"
key13 = -> { "key13" }
h = {
key1: 1,
"key-2": :two,
key3 : "3-expr",
"key4" : "4-String",
(5+0): "5-parentheses",
6: "6-Integer",
7.001: "7-Float",
"key" + "8": "8-String expr",
9+0: "9-Integer expr",
[10, 0]: "10-Array",
true ? 11 : 0 : "11-ternary",
key12(): "12-method",
key13[]: "13-lambda[]",
-> { "key14" }.call: "14-lambda.call"
}
p h
#=> {key1: 1, "key-2": :two, "key3" => "3-expr", "key4" => "4-String", 5 => "5-parentheses", 6 => "6-Integer", 7.001 => "7-Float", "key8" => "8-String expr", 9 => "9-Integer expr", [10, 0] => "10-Array", 11 => "11-ternary", "key12" => "12-method", "key13" => "13-lambda[]", "key14" => "14-lambda.call"}
p h.keys
#=> [:key1, :"key-2", "key3", "key4", 5, 6, 7.001, "key8", 9, [10, 0], 11, "key12", "key13", "key14"]
Familiar to developers from other languages¶
Python dicts have always allowed any hashable key types with : syntax.
With this proposal something like {200: "OK", 404: "Not Found"} can be used in either language.
Edge cases¶
{ %"a": 1 } # => {"a" => 1} percent string as computed key
{ :a: 1 } # => {a: 1} symbol as a symbolic key
{ :a : 1 } # => {a: 1} symbol as a symbolic key
{ (:a): 1 } # => {a: 1} symbol as a symbolic key
{ :"a": 1 } # => {a: 1} quoted symbol as a symbolic key
{ n : } # syntax error value omission not supported
{ n : 1, } # => {42 => 1} trailing comma ok
Implementation¶
Two files (plus tests):
- parse.y: One new production in assoc: | arg_value ':' arg_value
- prism/prism.c: In parse_assocs, accept PM_TOKEN_COLON when pm_symbol_node_label_p returns false
Zero lexer changes, zero new fields, zero LALR conflicts.
Reference implementation: feature/expr-colon-hash-keys PR on GitHub.
Historical context¶
A version of this was discussed on ruby-core in October 2007 (as part of "General hash keys for colon notation", murphy).
Unfortunately it was brought up during the v1.9 feature freeze, but it looks like Matz's invitation to discuss for v2.0 didn't end up going anywhere.
Since then, {"quoted label": value} (Ruby v2.2, Feature #4935) and { value_omission: } (Ruby v3.1, Feature #14579) have expanded the colon family, making the computed-key gap more conspicuous.
This proposal fills that gap with a minimal grammar change that requires no new lexer states.
Open questions¶
- Is this syntax acceptable to the community?
Future directions¶
All colon-based key syntaxes would now be available.
This opens up the possibility of eventually deprecating => from Hash literals (while keeping it for rescue, pattern matching, and rightward assignment).
This proposal does not require that change — it is simply the enabling step, and any deprecation timeline could be a separate discussion.
Updated by yertto (_ yertto) 22 days ago
- Description updated (diff)
Updated by yertto (_ yertto) 22 days ago
- Tracker changed from Bug to Feature
- Backport deleted (
3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN)
Updated by nobu (Nobuyoshi Nakada) 21 days ago
Updated by shyouhei (Shyouhei Urabe) 21 days ago
yertto (_ yertto) wrote:
Could this be what we need to one day retire our old friend "hash rocket" from Hashes entirely?
I don't think we should retire hash rockets only to make things confusing. Tell us why you want that to happen.
Updated by yertto (_ yertto) 20 days ago
· Edited
These look very confusing.
Fair - I should have included a = 1 - but yes - agreed - they do look confusing in isolation.
Could the confusion be unfamiliarity, rather than complexity though?
If we compare:
to:
... then perhaps a newcomer could get confused by both.
However, with this : proposal that could also be written as a "computed key":
(i.e. similar to JavaScript's computed property names, or jq's expression keys)
Personally, I'd reach for expr : value over (expr): value even in the ambiguous case,
but (expr): would be a valid alternative for style guides that prefer explicit grouping.
Note
Quick side question while I'm fortunate enough to have some ruby-core folks in a discussion here...
Why would something like this work:
case {:name => "Alice", :role => "admin"}
in {name: String => admin_name, role: "admin"} # capture the name String value into `admin_name`
p admin_name # => "Alice"
end
but this version of the same code would fail:
case {:name => "Alice", :role => "admin"}
in {name: String => admin_name, :role => "admin"} # throws a syntax error because `:role => "admin"` is used instead of `role: "admin"`
p admin_name
end
It's because, despite the Pattern Expression looking like a Hash, it can't use the old :role => "admin" Hash rocket - right?
May I ask why?
Was there a design decision made here on purpose, or by accident, or was it just giving the parser too much of an identity crisis to attempt to parse "hash" rockets right alongside the "rightward assignment" rockets?
... it seems to me that situations like this are the reason why it could be beneficial to the language if the use of hash rockets slowly faded.
However, to be clear, I'm not asking for Hash's => to be deprecated here. (That's a separate discussion for another day.)
This proposal is simply about expanding : so we have the option of using one separator instead of two.
Updated by yertto (_ yertto) 13 days ago
· Edited
BTW, what I was alluding to in my side question above was how adding "hash colon" syntax (eg. "key" : value) to Hash could unlock pattern matching on string-keyed values.
Previously, in {key: ...} only matched symbol keys, so parsed JSON, API responses, and other data with string keys required a recursive deep_symbolize_keys or transform_keys(&:to_sym) pass before you could pattern match on them.
So PR #17440 is a reference implementation of what using "hash colon" syntax could look like in practice.
The number of times I've wished this would "just work" in a simple, plain old ruby script... well now it does!
require "json"
accounts = JSON.parse(<<~JSON)
[
{"admin": {"first name": "Alice", "age": 30, "privileges": {"read": true, "write": true, "execute": false}}},
{"user": {"first name": "Bobby", "age": 25}}
]
JSON
accounts.each do |account|
case account
in {"admin" : {"first name" : String => name, "privileges" : {"write" : true}}}
puts "#{name} is an admin with write access"
in {"user" : {"first name" : String => name}}
puts "#{name} is a regular user"
end
end
# => Alice is an admin with write access
# => Bobby is a regular user
(This alone would make me a happier Ruby developer.)
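For comparison, a sketch of what the same match looks like on current Ruby, where the keys have to be symbolized first. It uses JSON.parse with symbolize_names: true as the conversion step; the constant ACCOUNTS_JSON and the method describe are names invented for this example.

```ruby
require "json"

# Same sample data as above, kept in a constant for the example.
ACCOUNTS_JSON = <<~JSON
  [
    {"admin": {"first name": "Alice", "age": 30, "privileges": {"read": true, "write": true, "execute": false}}},
    {"user": {"first name": "Bobby", "age": 25}}
  ]
JSON

# Hypothetical helper: matches an account whose keys are already Symbols.
def describe(account)
  case account
  in {admin: {"first name": String => name, privileges: {write: true}}}
    "#{name} is an admin with write access"
  in {user: {"first name": String => name}}
    "#{name} is a regular user"
  end
end

# symbolize_names: true turns every String key into a Symbol while parsing,
# which is the extra step that string-key patterns would make unnecessary.
accounts = JSON.parse(ACCOUNTS_JSON, symbolize_names: true)
accounts.each { |account| puts describe(account) }
```

This works, but it only helps when you control the parsing step; data that already arrives as a String-keyed Hash still needs a separate key-conversion pass.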
Updated by yertto (_ yertto) 8 days ago
· Edited
Looking for an answer to my own question:
Note
...
It's because, despite the Pattern Expression looking like a Hash, it can't use the old :role => "admin" Hash rocket - right?
May I ask why?
Was there a design decision made here on purpose, or by accident, or was it just giving the parser too much of an identity crisis to attempt to parse "hash" rockets right alongside the "rightward assignment" rockets?
I've found:
- https://docs.ruby-lang.org/en/3.0/syntax/pattern_matching_rdoc.html
  Note that only symbol keys are supported for hash patterns.
- @baweaver (Brandon Weaver) said in rubytalk.org.
  String keys do not work with pattern matching, or more specifically pattern expressions in this case. The only way to get around that is to use something to symbolize the keys.
- @pitr.ch (Petr Chalupa) said in #14912
  This was already mentioned as one of the problems to be looked at in future in the RubyKaigi's talk. If => is taken for as pattern then it cannot be used to match hashes with non-Symbol keys.
- @ktsj (Kazuki Tsujimoto) listed https://speakerdeck.com/k_tsj/pattern-matching-new-feature-in-ruby-2-dot-7?slide=58
  "Non-symbol keys for hash patterns" (as Future Work)
... but it looks to me that it's just because the "rightward assignment rocket" locks up access to the "hash rocket" within a Hash Pattern.
(ie. because they both can't use => within the same context.)
Updated by baweaver (Brandon Weaver) 1 day ago
I'd like to comment on the broader goal here, rather than the specific syntax, to start. I've written about this subject before in The Case for Pattern Matching Key Irreverence in Ruby and I think that the goal of matching against non-Symbol keys in pattern matching is a missing feature.
That said I do have concerns with the proposed syntax.
The { a : 1 } vs { a: 1 } Distinction Is Confusing¶
I'll be direct: disambiguating semantics based on whitespace around a colon is going to be a source of bugs and confusion.
These are visually near-identical and semantically opposite. Code formatters, copy-paste, and a single keystroke can silently change the meaning of your program. nobu's comment #28 ("These look very confusing") captures my reaction as well.
Ruby has a history of being generous with whitespace, so to encode meaning into spacing would be a deviation from the language design, I feel. It optimizes more for syntactic completeness at the cost of readability.
But the Goal Is Right: Pattern Matching Needs String Key Support¶
Where I strongly agree is with the underlying motivation, particularly what yertto demonstrates in comment #32:
case JSON.parse(data)
in {"admin" : {"first name" : String => name, "privileges" : {"write" : true}}}
puts "#{name} is an admin with write access"
end
Existing pattern matching cannot capture String-keyed hashes without deep_symbolize_keys or similar methods, and that feels like an ergonomic miss. JSON, CSV, YAML, HTTP headers, and other concepts especially around serialization arrive with String keys, making them incompatible with pattern matching as-is.
The problem we have is that hash rockets have been overloaded to mean both key => value and value => variable, blocking us from leveraging this for String keys. Kazuki Tsujimoto acknowledged this in his RubyKaigi talk, listing "Non-symbol keys for hash patterns" as explicit future work.
There Is Deep Precedent for Key Irreverence¶
Ruby has a long tradition of being irreverent about the Symbol/String distinction when intent is clear:
- send, define_method, instance_variable_get, const_get: all accept either Strings or Symbols because Ruby prioritizes "knowing what you meant."
- Keyword arguments: evolved from Hash<Symbol, Any> coercion into their own thing.
- Punning ({ x:, y: }): the Symbol functions as a method call, not a literal.
- Hash#with_indifferent_access in Rails: the most popular framework in the ecosystem decided this distinction shouldn't matter. I'm inclined to agree with them.
As Polished Ruby Programming puts it: "when Ruby needs a symbol, it will often accept a string and convert it for the programmer's convenience." It is established Ruby precedent to favor convenience over explicitness.
Pattern matching's deconstruct_keys already compromises on Symbol literalism. When you match against a Point:
The :x passed here to deconstruct_keys does not represent a Symbol hash key, it represents an instance variable or a method call. The Symbol is functioning as a query parameter, not a type-literal key. Pattern matching is a query language against available fields, not a Hash-to-Hash comparison.
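A minimal sketch of that idea, assuming a hypothetical Point class (this is not the original example from the post). The requested_keys reader exists only so the example can show what pattern matching actually passes to deconstruct_keys.

```ruby
# Hypothetical class for illustration: it stores no Hash at all,
# yet it can be matched with a hash pattern.
class Point
  attr_reader :x, :y, :requested_keys

  def initialize(x, y)
    @x = x
    @y = y
    @requested_keys = nil
  end

  # Pattern matching calls this with the Symbols named in the pattern
  # (or nil when the pattern uses **rest). Here the Symbols select
  # attributes; they are not keys of any stored Hash.
  def deconstruct_keys(keys)
    @requested_keys = keys
    { x: x, y: y }
  end
end

point = Point.new(1, 2)

summary =
  case point
  in { x: Integer => px, y: Integer => py }
    "x=#{px}, y=#{py}"
  end

puts summary
p point.requested_keys
```

The object decides how to answer the request for :x and :y, which is why the Symbols behave more like field names in a query than like literal Hash keys.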
What I'd Prefer to See¶
Rather than whitespace-sensitive disambiguation, I'd prefer one or both of:
1. The (expr): value syntax from Feature #22108.
2. A semantic convention that deconstruct_keys should be key-irreverent: implementations coerce String↔Symbol as appropriate for the object's internal representation. This is the approach I advocated in my original post and in discussions on CSV#246. For Hash specifically, try the Symbol key first, fall back to the String variant. The performance cost is potentially acceptable and could be JIT'd or optimized:
# Don't do this in production code.
# HashOriginal gives us a "Ruby" implementation to compare against,
# rather than the C one.
require "benchmark/ips" # provided by the benchmark-ips gem
class HashOriginal < Hash
def deconstruct_keys(keys)
return self unless keys
keys.each_with_object({}) do |key, matches|
matches[key] = self[key] if key?(key)
end
end
end
class HashPrime < Hash
def deconstruct_keys(keys)
if keys.nil?
self.transform_keys(&:to_sym)
else
keys.each_with_object({}) do |key, matches|
if key?(key)
matches[key] = self[key]
elsif key?(key.to_s)
matches[key] = self[key.to_s]
end
end
end
end
end
Benchmark.ips do |x|
x.report("Hash") do
Hash[a: 1, b: 2] in { a: 1, b: 2, c: 3 }
end
x.report("HashOriginal") do
HashOriginal[a: 1, b: 2] in { a: 1, b: 2, c: 3 }
end
x.report("HashPrime") do
HashPrime[a: 1, b: 2] in { a: 1, b: 2, c: 3 }
end
x.report("HashPrime String") do
HashPrime[a: 1, "b" => 2] in { a: 1, b: 2, c: 3 }
end
end
# Warming up --------------------------------------
# Hash 299.651k i/100ms
# HashOriginal 103.085k i/100ms
# HashPrime 85.180k i/100ms
# HashPrime String 80.537k i/100ms
# Calculating -------------------------------------
# Hash 2.951M (± 3.0%) i/s - 14.983M in 5.081797s
# HashOriginal 1.057M (± 2.6%) i/s - 5.360M in 5.075552s
# HashPrime 924.234k (± 3.8%) i/s - 4.685M in 5.076900s
# HashPrime String 784.882k (± 4.9%) i/s - 3.946M in 5.041825s
Option 1 solves the parser problem without introducing whitespace ambiguity. Option 2 solves it at the semantic layer for objects like CSV::Row, MatchData, and custom classes. Ideally, both.
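To make Option 2 concrete, here is a small usage sketch of a key-irreverent deconstruct_keys on a Hash subclass. The class name LenientHash is made up for this example and follows the same Symbol-first, String-fallback idea as HashPrime above; it is a sketch under those assumptions, not a proposed core implementation.

```ruby
# Hypothetical subclass for illustration only.
class LenientHash < Hash
  def deconstruct_keys(keys)
    # keys is nil when the pattern uses **rest; this branch assumes
    # every key responds to to_sym (true for Strings and Symbols).
    return transform_keys(&:to_sym) if keys.nil?

    keys.each_with_object({}) do |key, found|
      if key?(key)
        found[key] = self[key]           # Symbol key present: use it
      elsif key?(key.to_s)
        found[key] = self[key.to_s]      # otherwise fall back to the String key
      end
    end
  end
end

# Hypothetical helper used to compare a plain Hash with LenientHash.
def admin_name(row)
  case row
  in { name: String => name, role: "admin" }
    name
  else
    nil
  end
end

string_keyed = { "name" => "Alice", "role" => "admin" }

p admin_name(string_keyed)               # plain Hash with String keys
p admin_name(LenientHash[string_keyed])  # same data, key-irreverent matching
```

The pattern itself is unchanged ordinary symbol-key syntax; only the object's answer to deconstruct_keys differs, so no new whitespace rule is involved.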
The Historical Arc Supports This¶
The colon family has been expanding for almost 20 years:
| Version | Feature |
|---|---|
| 1.9 | foo: value: Symbol key shorthand |
| 2.2 | "quoted": value: Quoted symbol labels |
| 3.1 | name:: Value omission (punning) |
Each one was met with resistance when added, but each grew on the Ruby community over time. I believe it's time to consider completing this arc for non-Symbol keys, but I want to make sure to do it in a way that does not give distinct meaning to whitespace that could cause errors and confusion.
Updated by yertto (_ yertto) 1 day ago
Thanks @baweaver (Brandon Weaver).
(This proposal is not just to free up String keys, but - yes - there must be a considerable amount of code and authors that have been, and will continue to be, affected just by this lack of support for String keys in pattern matching.)
I'll be direct: disambiguating semantics based on whitespace around a colon is going to be a source of bugs and confusion.
One could argue that a similar whitespace problem already occurs in the language when trying to figure out why a is a variable and :a is a symbol.
However, after talking to a number of Ruby "veterans" about this...
Yes. I concede that the { a : 1 } vs { a: 1 } confusion presents just too much of a mental leap for them to make at this stage.
So perhaps:
- { (expr): value } "hash capsule" syntax (reference implementation, feature #22108), would be a better choice than,
- { expr : value } "hash colon" syntax (reference implementation, feature #22111)
The thing is though, that the hash colon implementation (which essentially just allows a : to be used as an alternative to =>) is a lot simpler, a lot less code AND supports a superset of key syntaxes - including the use of hash capsules - out of the box.
So my thinking is that while the Ruby language could support hash colon syntax in full, the feature and all its documentation could be released in a way that only mentions the introduction of the hash capsule syntax (...): as an alternative to the hash rocket => for non-symbolic keys.
Then, when the community (and rubocop) see hash capsules' redundancy, parentheses could slowly fade and hash colon syntax can "just work" (as it always did).
(ie. similar to how many users may have initially used parentheses in their calls to some_method(), but eventually felt comfortable enough just calling some_method instead.)
The colon family has been expanding for almost 20 years:
...
Each one was met with resistance when added, but each grew on the Ruby community over time. I believe it's time to consider completing this arc for non-Symbol keys, but I want to make sure to do it in a way that does not give distinct meaning to whitespace that could cause errors and confusion.
What would you think of the hash colon syntax being slowly phased in like so:
Hash Colon Syntax phase in...¶
| Current Syntax | Hash Colon (using Hash Capsules on all non-symbolic keys) | Hash Colon (using Hash Capsules only where necessary to avoid whitespace ambiguity) | Hash Colon (with ?acceptable? whitespace ambiguity) |
|---|---|---|---|
| { key1: 1 } | { key1: 1 } | { key1: 1 } | { key1: 1 } |
| { "key-2": 2 } | { "key-2": 2 } | { "key-2": 2 } | { "key-2": 2 } |
| { key3 => 3 } | { (key3): 3 } | { (key3): 3 } | { key3 : 3 } |
| { "key4" => 4 } | { ("key4"): 4 } | { ("key4"): 4 } | { "key4" : 4 } |
| { 5+0 => 5 } | { (5+0): 5 } | { 5+0: 5 } | { 5+0: 5 } |
| { 6 => 6 } | { (6): 6 } | { 6: 6 } | { 6: 6 } |
| { 7.001 => 7 } | { (7.001): 7 } | { 7.001: 7 } | { 7.001: 7 } |
| { "key" + "8" => 8 } | { ("key" + "8"): 8 } | { "key" + "8": 8 } | { "key" + "8": 8 } |
| { 9+0 => 9 } | { (9+0): 9 } | { 9+0: 9 } | { 9+0: 9 } |
| { [10, 0] => 10 } | { ([10, 0]): 10 } | { [10, 0]: 10 } | { [10, 0]: 10 } |
Where it could be left to the community to one day decide if that final phase (with its whitespace ambiguity) is acceptable or not.
Is that a workable compromise @nobu (Nobuyoshi Nakada) / @shyouhei (Shyouhei Urabe) ?
Updated by austin (Austin Ziegler) 1 day ago
yertto (_ yertto) wrote in #note-35:
However, after talking to a number of Ruby "veterans" about this...
Yes. I concede that the { a : 1 } vs { a: 1 } confusion presents just too much of a mental leap for them to make at this stage.
So perhaps:
- { (expr): value } "hash capsule" syntax (reference implementation, feature #22108), would be a better choice than,
- { expr : value } "hash colon" syntax (reference implementation, feature #22111)
I don't like { expr : value } because I think that it is too confusable and prone to typos, but right now { a : 3 } is a syntax error (at least under prism).
irb(main):001> { a : 3 }
/Users/austin/.local/share/mise/installs/ruby/4.0.5/lib/ruby/gems/4.0.0/gems/irb-1.16.0/exe/irb:9:in '<top (required)>': (irb):1: syntax errors found (SyntaxError)
> 1 | { a : 3 }
| ^ expected a `}` to close the hash literal
| ^ expected a `=>` between the hash key and value
| ^ unexpected ':', ignoring it
| ^ unexpected ':', expecting end-of-input
| ^ unexpected ':'; expected a value in the hash literal
| ^ unexpected '}', ignoring it
| ^ unexpected '}', expecting end-of-input
So from a parsing perspective, this could probably work. But would { a : 3 } raising NameError be better than the syntax error we have now just because one typed { a : 3 } meaning to type { a: 3 }? If expr is a function or variable, I think that there's too much chance of the wrong thing being performed when constructing a hash.
I don't think that the key3 ambiguity will ever be acceptable in the examples you provided above, but I think that if an expr is unambiguous, the parentheses could be skipped. Ambiguous values — values which could be bare symbols or string-indicated symbols — should never be allowed.
I think the appropriate stopping point is column 3 of your table, and never column 4: there's little benefit to it over the third step and many downsides, including high levels of confusability resulting in many footguns that will reduce the overall security stance of the Ruby language.