Feature #15563
open#dig that throws an exception if a key doesn't exist
Added by 3limin4t0r (Johan Wentholt) almost 6 years ago. Updated almost 5 years ago.
Description
Ruby 2.3.0 introduced #dig
for Array
, Hash
and Struct
. Both Array
and Hash
have #fetch
, which works the same way as #[]
, but instead of returning the default value, raises an exception (unless a second argument or block is given). Hash
also has #fetch_values
, which works the same way as #values_at
, raising an exception if a key is missing. For #dig
, there is no such option.
My proposal is to add a method that works the same way as #dig
, but instead of using the #[]
accessor, uses #fetch
.
This method would look something like this:
module Traversable
def traverse(key, *others)
value = fetch(key)
return value if others.empty?
if value.respond_to?(__method__, true)
value.send(__method__, *others)
else
raise TypeError, "#{value.class} does not have ##{__method__} method"
end
end
end
Array.include(Traversable)
Hash.include(Traversable)
The exception raised is taken from #dig
([1].dig(0, 1) #=> TypeError: Integer does not have #dig method
).
yaml = YAML.load_file('some_file.yml')
# this change is meant to make chaining #fetch calls more easy
yaml.fetch('foo').fetch('bar').fetch('baz')
# would be replaced with
yaml.traverse('foo', 'bar', 'baz')
I'm curious to hear what you guys think about the idea as a whole and the method name.
Updated by 3limin4t0r (Johan Wentholt) almost 6 years ago
- Description updated (diff)
I just discovered that #dig
also call private methods. I updated the provided examples to do the same.
hash = { b: 'b' }
hash.singleton_class.send(:private, :dig)
{ a: hash }.dig(:a, :b)
#=> 'b'
Updated by shevegen (Robert A. Heiler) almost 6 years ago
I have no particular pro or con against the feature itself as such; I myself do not use or need .dig so I
can not speak much about it. But I believe one problem with the proposal here is the name.
I think a name such as "dig_e" would be very, very rare to see in ruby. Of course I have no idea how
matz thinks about it, but I would recommend to you to also consider alternative names; or perhaps
let it handle just through arguments, whatever may seem to fit better.
Short names are sometimes really, really great, such as p and pp; but I think one overall concern may
be to not lose too much of the meaning. Off the top of my head, I can only think of FileUtils having
odd/very short method names, and this is mostly because it sort of "simulates" how coreutils utilities
such as "mkdir -p" and similar work.
If you look at recent changes in ruby, you may notice the :exception key - :e would be shorter than
that too, but I think it may not be a primary goal at all times to be too overly succinct, so if that is
a valid reasoning then I think this may explain why :exception would be used, and no shorter
variant. A similar reasoning could apply to the case here - but again, ultimately you have to see what
matz thinks about it not how others may think about it. :)
Updated by jwmittag (Jörg W Mittag) almost 6 years ago
shevegen (Robert A. Heiler) wrote:
I have no particular pro or con against the feature itself as such; I myself do not use or need .dig so I
can not speak much about it. But I believe one problem with the proposal here is the name.I think a name such as "dig_e" would be very, very rare to see in ruby. Of course I have no idea how
matz thinks about it, but I would recommend to you to also consider alternative names; or perhaps
let it handle just through arguments, whatever may seem to fit better.
There is a well-established convention in Ruby, when you have a pair of methods that does similar things in different ways, to name them foo
and foo!
. For example, select
and select!
, Process::exit
and Process::exit!
, and so on.
So, one possibility would be dig!
.
Updated by matz (Yukihiro Matsumoto) almost 6 years ago
I am against dig!
for this purpose. When we have two versions of a method (foo
and foo!
), the bang version should be more dangerous than the non-bang version. dig!
is not the case.
And with whatever name, we need the real-world use-case for a new method. "We don't have fetch
counterpart of dig
" is not good enough.
Matz.
Updated by k0kubun (Takashi Kokubun) almost 6 years ago
- Is duplicate of Feature #12282: Hash#dig! for repeated applications of Hash#fetch added
Updated by k0kubun (Takashi Kokubun) almost 6 years ago
Personally I've hit a real-world use-case of this feature many times.
I often manage structured configs with nested YAML files and load them from Ruby. With current Ruby, to avoid an unhelpful exception NoMethodError
, I assert the existence of the deep keys using a Hash#fetch
chain like this:
config = YAML.load_file('config.yml')
config.fetch('production').fetch('environment').fetch('SECRET_KEY_BASE') #=> an exception like: KeyError: key not found: "SECRET_KEY_BASE"
If we had such a method, we would be able to easily write (let's say it's named Hash#deep_fetch
instead of #dig!
):
config.deep_fetch('production', 'environment', 'SECRET_KEY_BASE')
and the best part is that we could get a more helpful error message like "key not found: production.environment.SECRET_KEY_BASE
" whose nested information isn't available with a Hash#fetch
method chain.
By the way, if we had this, I would like to have a keyword argument default:
like the second optional argument of Hash#fetch
:
config.deep_fetch('production', 'environment', 'SECRET_KEY_BASE', default: '002bbfb0a35d0fd05b136ab6333dc459')
we want to safely manage the credentials only for production, so sometimes we don't want to manage credentials in (safely-managed originally-encrypted) YAML file for development environment and just want to return the unsafe thing as a default value.
(edit: s/fetch_keys/deep_fetch/
, like I proposed 2 years ago)
Updated by 3limin4t0r (Johan Wentholt) almost 6 years ago
My scenario would be similar to k0kubuns scenario.
# The connection translates the request to JSON and parses the response
# from JSON into the correct objects. In this case a nested hash structure.
response = connection.send(request)
# assign shortcuts
report = response
.fetch('Era.Common.NetworkMessage.ConsoleApi.Reports.RpcGenerateReportResponse')
.fetch('report')
column_data = report.fetch('data').fetch('columns')
column_labels = report.fetch('rendering').fetch('table').fetch('columns')
# build report
report_data = column_data.each_with_object({}) do |column, data|
column_id = column.fetch('header').fetch('column_id')
data[column_id] = column.fetch('values')
end
report = column_labels.each_with_object({}) do |column, data|
label = column.fetch('label').fetch('literal')
column_id = column.fetch('column_id')
data[label] = report_data.fetch(column_id)
end
From the above scenario you can see that having this new functionality would help clean things up.
The reason I use #fetch
here is because the API to which I'm talking might change its structure. Getting an error as soon as possible reduces debug time. If #dig
where used, nil
would be returned when the structure is invalid. This would most of the time raise an exception somewhere else that then needs to be traced back to its source (the changed response structure).
My preference goes out to dropping the "returning nil
if any intermediate step is nil
" description (as described in point 2 of the feature proposal). Otherwise, when a key is present but the value is set to nil
it will short circuit out of the method. Dropping this part of the #dig
description would ensure the full path is traversed.
I also had a look at the linked feature proposal. I find the name #deep_fetch
the most descriptive. #fetch_keys
sounds like it will fetch multiple keys on a single hash (basically what #fetch_values
does). #fetch_all
suffers from the same problem. If the eventual version always traverses the full path (see point 2 of the feature proposal) #traverse
could be an option.
Updated by walerian (Walerian Sobczak) over 5 years ago
I would suggest #retrieve
. It's just a stronger #fetch
, and the dictionary definition reflects its meaning:
retrieve (verb)
- get or bring (something) back from somewhere
- find or extract
The name is still short and simple, but also idiomatic and meaningful at the same time.
config = YAML.load_file('config.yml')
# so instead of this:
config.fetch('production').fetch('environment').fetch('SECRET_KEY_BASE')
# we would have:
config.retrieve('production', 'environment', 'SECRET_KEY_BASE')
Updated by k0kubun (Takashi Kokubun) over 5 years ago
- Is duplicate of Feature #14602: Version of dig that raises error if a key is not present added
Updated by knu (Akinori MUSHA) over 5 years ago
I thought adding an optional block to dig could be an idea.
obj.dig(*keys) { |dug_keys, rest_keys|
# raise yourself or return something
}
But the costs required for extending the dig method as such in Array or Hash-like classes would be a bit too high.
Updated by 3limin4t0r (Johan Wentholt) about 5 years ago
- Description updated (diff)
Changed naming from #dig_e
to #traverse
. Removed alternative options, and kept the strict version since #fetch
is also strict. Possibility to expand upon this with a block or :default
keyword argument is possible, but should be its own feature request.
Updated by 3limin4t0r (Johan Wentholt) about 5 years ago
- Description updated (diff)
Removed the two discussion points in the previous edit. This edit adjusts the last sentence to reflect this.
Updated by amcaplan (Ariel Caplan) almost 5 years ago
matz (Yukihiro Matsumoto) wrote:
I am against
dig!
for this purpose. When we have two versions of a method (foo
andfoo!
), the bang version should be more dangerous than the non-bang version.dig!
is not the case.And with whatever name, we need the real-world use-case for a new method. "We don't have
fetch
counterpart ofdig
" is not good enough.Matz.
I think that I, along with others, have gotten used to bang methods being subject to more errors rather than mutating state, due to how ActiveSupport uses them. Ruby is not Rails, though, so this is a good reminder :)
I like #deep_fetch
, which also happens to be a gem that already does this (https://rubygems.org/gems/deep_fetch). It was designed for implicitly validating that API formats have not changed, which says something about the need for this. I think having an easy way to dig
while doing that constant validation would encourage writing code that notifies us right away when our assumptions are wrong.
As an alternative, maybe #fetch_strict
or (though I like this less) a named parameter passed to dig
like my_array.dig(:foo, :bar, :baz, strict: true)
. The latter option currently isn't possible since it's unclear whether strict: true
is a key or a named parameter, but that problem will go away in Ruby 3.
Updated by robb (Robb Shecter) almost 5 years ago
amcaplan (Ariel Caplan) wrote in #note-13:
matz (Yukihiro Matsumoto) wrote:
[...] with whatever name, we need the real-world use-case for a new method. "We don't have
fetch
counterpart ofdig
" is not good enough.[...] implicitly validating that API formats have not changed...
I have a similar real-world use-case: I use chained #fetch
s when importing JSON: My code needs the JSON to have the correct attributes and structure, and when it's tedious to explicitly check and handle every potential missing attribute. I can either: let my import script just crash with the clear exception message, or choose to handle exceptions in just one place. (Handling nil
s is more complex and less clear.). From my production app:
# @param from_json [Hash]
# @return [nil]
def import_objects(from_json:)
import_chapter from_json.fetch('statuteTree').fetch('chapter0')
import_titles from_json.fetch('statuteTree').fetch('titles')
end
# ...
# 3. Decide: Are my children Sections or Sub-chapters? Then,
# import them.
child_nodes = chapter_json.fetch('content').fetch('contents')
# ...
# @param nrs_json [Hash]
# @return [Boolean]
def simple_subchapter_content?(nrs_json)
nrs_json.fetch('children').fetch('tag') == 'SubChapterSections'
end
# @param nrs_json [Hash]
# @return [Boolean]
def simple_chapter_content?(nrs_json)
nrs_json.fetch('content').fetch('tag') == 'SimpleChapterContent'
end
I wrote this gem for chained #fetch
s on Hashes and Arrays and has a very clean one-line implementation using #reduce
: https://github.com/dogweather/digbang to make it easier me for the do the above. Caveat, it uses the Rails-like #dig!
naming, but this shows the real-world use-case for a checked #dig
.
Updated by robb (Robb Shecter) almost 5 years ago
matz (Yukihiro Matsumoto) wrote in #note-4:
And with whatever name, we need the real-world use-case for a new method.
Here is another real-world use-case, from my public.law apps which parse JSON:
def insert_subparts(for_id:, with_map:)
sub_documents = JSON.parse(File.read("#{for_id}.json")).fetch('result').fetch('documents').fetch('documents').fetch('items')
insert_sub_documents from_tree: sub_documents,
under_parent: with_map.fetch(for_id)
end
I'm sure you can see why I've created syntactic/semantic sugar for all these #fetch
invocations.
Updated by sawa (Tsuyoshi Sawada) almost 5 years ago
- Subject changed from #dig that throws an exception if an key doesn't exist to #dig that throws an exception if a key doesn't exist
- Description updated (diff)