Feature #15563

#dig that throws an exception if a key doesn't exist

Added by 3limin4t0r (Johan Wentholt) 25 days ago. Updated 22 days ago.

Status: Open
Priority: Normal
Assignee: -
Target version: -
[ruby-core:91265]

Description

Ruby 2.3.0 introduced #dig for Array, Hash and Struct. Both Array and Hash have #fetch, which does the same as #[] but raises an exception instead of returning the default value (unless a second argument or block is given). Hash also has #fetch_values, which does the same as #values_at but raises an exception if a key is missing. For #dig there is no such option.
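
For readers less familiar with these accessors, here is a small sketch of the lenient/strict pairs mentioned above. The sample data and the error_class_of helper are made up for illustration; the accessor methods themselves are existing core methods.

```ruby
settings = { host: "localhost", port: 5432 }
rows     = [[1, 2], [3, 4]]

# Lenient accessors return nil (or the Hash default) when something is missing.
lenient_single = settings[:user]                  # => nil
lenient_multi  = settings.values_at(:host, :user) # => ["localhost", nil]
lenient_nested = rows.dig(5, 0)                   # => nil

# Illustrative helper: returns the class of the exception raised by the block,
# or nil when the block does not raise.
def error_class_of
  yield
  nil
rescue StandardError => e
  e.class
end

# Strict accessors raise instead of returning nil.
strict_single = error_class_of { settings.fetch(:user) }               # => KeyError
strict_multi  = error_class_of { settings.fetch_values(:host, :user) } # => KeyError
strict_array  = error_class_of { rows.fetch(5) }                       # => IndexError

# #fetch only raises when neither a default argument nor a block is given.
with_default = settings.fetch(:user, "postgres")           # => "postgres"
with_block   = settings.fetch(:user) { |key| "no #{key}" } # => "no user"
```

#dig sits in the lenient column and currently has no strict partner, which is the gap this proposal is about.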

My proposal is to add a method which does the same as #dig, but instead of using the #[] accessor it uses #fetch.

This method would look something like this:

module DigWithException
  def dig_e(key, *others)
    value = fetch(key)
    return value if value.nil? || others.empty?

    if value.respond_to?(__method__, true)
      value.send(__method__, *others)
    else
      raise TypeError, "#{value.class} does not have ##{__method__} method"
    end
  end
end

Array.include(DigWithException)
Hash.include(DigWithException)

The exception raised is also taken from #dig ([1].dig(0, 1) #=> TypeError: Integer does not have #dig method). I personally have my issues with the name #dig_e, but I haven't found a better name yet.
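
To make the intended behaviour concrete, here is how the sketch above would behave on some made-up data. The module is repeated so the snippet runs on its own; #dig_e is still only the working name from this proposal, not an existing method.

```ruby
# Same module as in the proposal, repeated so this snippet is self-contained.
module DigWithException
  def dig_e(key, *others)
    value = fetch(key)
    return value if value.nil? || others.empty?

    if value.respond_to?(__method__, true)
      value.send(__method__, *others)
    else
      raise TypeError, "#{value.class} does not have ##{__method__} method"
    end
  end
end

Array.include(DigWithException)
Hash.include(DigWithException)

data = { users: [{ name: "Ann" }] } # illustrative data

# Every key exists: behaves like #dig.
found = data.dig_e(:users, 0, :name) # => "Ann"

# A missing Hash key raises KeyError (from Hash#fetch).
missing_key = begin
  data.dig_e(:users, 0, :email)
rescue KeyError => e
  e.message
end
# => "key not found: :email"

# A missing Array index raises IndexError (from Array#fetch).
missing_index = begin
  data.dig_e(:users, 3, :name)
rescue IndexError => e
  e.class
end
# => IndexError

# Digging into a value that is not a container raises TypeError, as #dig does.
not_diggable = begin
  data.dig_e(:users, 0, :name, :first)
rescue TypeError => e
  e.message
end
# => "String does not have #dig_e method"
```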

There are also a few other things that I haven't thought out yet.

  1. Should this method accept a block, which will be passed to the #fetch call and to the recursive #dig_e calls?

    module DigWithException
      def dig_e(key, *others, &block)
        value = fetch(key, &block)
        return value if value.nil? || others.empty?
    
        if value.respond_to?(__method__, true)
          value.send(__method__, *others, &block)
        else
          raise TypeError, "#{value.class} does not have ##{__method__} method"
        end
      end
    end
    
    Array.include(DigWithException)
    Hash.include(DigWithException)
    
  2. I currently kept the code compatible with the #dig description.

    Extracts the nested value specified by the sequence of key objects by calling dig at each step, returning nil if any intermediate step is nil.

    However, with this new version of the method one could consider dropping the "returning nil if any intermediate step is nil" part, since this would be the more strict version.

    module DigWithException
      def dig_e(key, *others)
        value = fetch(key)
        return value if others.empty?
    
        if value.respond_to?(__method__, true)
          value.send(__method__, *others)
        else
          raise TypeError, "#{value.class} does not have ##{__method__} method"
        end
      end
    end
    
    Array.include(DigWithException)
    Hash.include(DigWithException)
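
The practical difference between these variants can be sketched as follows. To keep them apart without patching Hash three times, the snippet uses hypothetical Hash subclasses (LenientHash, StrictHash, BlockHash); those names and the sample data are assumptions made for this illustration only.

```ruby
# Original sketch: stops and returns nil when an intermediate value is nil.
module LenientDig
  def dig_e(key, *others)
    value = fetch(key)
    return value if value.nil? || others.empty?
    raise TypeError, "#{value.class} does not have ##{__method__} method" unless value.respond_to?(__method__, true)

    value.send(__method__, *others)
  end
end

# Point 2: nil is not treated specially, so the full path must be traversable.
module StrictDig
  def dig_e(key, *others)
    value = fetch(key)
    return value if others.empty?
    raise TypeError, "#{value.class} does not have ##{__method__} method" unless value.respond_to?(__method__, true)

    value.send(__method__, *others)
  end
end

# Point 1: the block is forwarded to every #fetch call along the path.
module BlockDig
  def dig_e(key, *others, &block)
    value = fetch(key, &block)
    return value if value.nil? || others.empty?
    raise TypeError, "#{value.class} does not have ##{__method__} method" unless value.respond_to?(__method__, true)

    value.send(__method__, *others, &block)
  end
end

# Hypothetical subclasses, used only to host one variant each.
class LenientHash < Hash; include LenientDig; end
class StrictHash < Hash; include StrictDig; end
class BlockHash < Hash; include BlockDig; end

# Key present but value nil, with more keys left to traverse.
lenient_result = LenientHash[a: nil].dig_e(:a, :b) # => nil

strict_error = begin
  StrictHash[a: nil].dig_e(:a, :b)
rescue TypeError => e
  e.message
end
# => "NilClass does not have #dig_e method"

# With a block, a missing key at the last step falls back to the block value.
nested = BlockHash[a: BlockHash[b: 1]]
late_fallback = nested.dig_e(:a, :missing) { |key| "default for #{key}" }
# => "default for missing"

# A missing key earlier in the path also yields the block value, but the
# traversal then continues into that value, which here is not diggable.
early_fallback_error = begin
  nested.dig_e(:missing, :b) { |key| "default for #{key}" }
rescue TypeError => e
  e.message
end
# => "String does not have #dig_e method"
```

The last case suggests that, if a block is supported, it may need to be decided whether a fallback value ends the traversal or is dug into further.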
    

I'm curious to hear what you guys think about the idea as a whole, the method name and the two points described above.


Related issues

Is duplicate of Ruby trunk - Feature #12282: Hash#dig! for repeated applications of Hash#fetch (Open)

History

Updated by 3limin4t0r (Johan Wentholt) 25 days ago

  • Description updated (diff)

I just discovered that #dig also calls private methods. I updated the provided examples to do the same.

hash = { b: 'b' }
hash.singleton_class.send(:private, :dig)
{ a: hash }.dig(:a, :b)
#=> 'b'

Updated by shevegen (Robert A. Heiler) 25 days ago

I have no particular pro or con against the feature itself as such; I myself do not use or need .dig so I
can not speak much about it. But I believe one problem with the proposal here is the name.

I think a name such as "dig_e" would be very, very rare to see in ruby. Of course I have no idea how
matz thinks about it, but I would recommend that you also consider alternative names; or perhaps
handle it just through arguments, whichever seems to fit better.

Short names are sometimes really, really great, such as p and pp; but I think one overall concern may
be to not lose too much of the meaning. Off the top of my head, I can only think of FileUtils having
odd/very short method names, and this is mostly because it sort of "simulates" how coreutils utilities
such as "mkdir -p" and similar work.

If you look at recent changes in ruby, you may notice the :exception key - :e would be shorter than
that too, but I think it may not be a primary goal at all times to be too overly succinct, so if that is
a valid reasoning then I think this may explain why :exception would be used, and no shorter
variant. A similar reasoning could apply to the case here - but again, ultimately you have to see what
matz thinks about it, not how others may think about it. :)

Updated by jwmittag (Jörg W Mittag) 24 days ago

shevegen (Robert A. Heiler) wrote:

I have no particular pro or con against the feature itself as such; I myself do not use or need .dig so I
can not speak much about it. But I believe one problem with the proposal here is the name.

I think a name such as "dig_e" would be very, very rare to see in ruby. Of course I have no idea how
matz thinks about it, but I would recommend that you also consider alternative names; or perhaps
handle it just through arguments, whichever seems to fit better.

There is a well-established convention in Ruby, when you have a pair of methods that do similar things in different ways, to name them foo and foo!. For example, select and select!, Process::exit and Process::exit!, and so on.

So, one possibility would be dig!.

Updated by matz (Yukihiro Matsumoto) 24 days ago

I am against dig! for this purpose. When we have two versions of a method (foo and foo!), the bang version should be more dangerous than the non-bang version. dig! is not the case.

And whatever the name, we need a real-world use case for a new method. "We don't have a fetch counterpart of dig" is not good enough.

Matz.


Updated by k0kubun (Takashi Kokubun) 24 days ago

  • Is duplicate of Feature #12282: Hash#dig! for repeated applications of Hash#fetch added

Updated by k0kubun (Takashi Kokubun) 24 days ago

Personally I've hit a real-world use-case of this feature many times.

I often manage structured configs with nested YAML files and load them from Ruby. With current Ruby, to avoid an unhelpful NoMethodError, I assert the existence of the deep keys using a Hash#fetch chain like this:

config = YAML.load_file('config.yml')
config.fetch('production').fetch('environment').fetch('SECRET_KEY_BASE') #=> an exception like: KeyError: key not found: "SECRET_KEY_BASE"

If we had such a method, we would be able to write (let's say it's named Hash#fetch_keys instead of #dig!):

config.fetch_keys('production', 'environment', 'SECRET_KEY_BASE')

and the best part is that we could get a more helpful error message like "key not found: production.environment.SECRET_KEY_BASE" whose nested information isn't available with Hash#fetch method chains.
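
A rough sketch of what such an error message could look like, written as a standalone helper. The name fetch_path, its signature and the sample config are assumptions made for this illustration; nothing like it exists in core Ruby.

```ruby
# Hypothetical helper (not an existing Ruby method): fetches a nested value
# and reports the path walked so far when a key or index is missing.
def fetch_path(container, *keys)
  keys.each_with_index.inject(container) do |current, (key, index)|
    begin
      current.fetch(key)
    rescue IndexError # KeyError is a subclass of IndexError
      raise KeyError, "key not found: #{keys[0..index].join('.')}"
    end
  end
end

# Illustrative config, standing in for the result of YAML.load_file.
config = { 'production' => { 'environment' => {} } }

deep_error = begin
  fetch_path(config, 'production', 'environment', 'SECRET_KEY_BASE')
rescue KeyError => e
  e.message
end
# => "key not found: production.environment.SECRET_KEY_BASE"

shallow_error = begin
  fetch_path(config, 'staging', 'environment', 'SECRET_KEY_BASE')
rescue KeyError => e
  e.message
end
# => "key not found: staging"
```

This sketch only handles missing keys; an intermediate value that does not respond to #fetch would still surface as a NoMethodError.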


By the way, if we had this, I would like to have a keyword argument default: like the second optional argument of Hash#fetch:

env = 'production' # can be 'staging', 'development'
config.fetch_keys(env, 'environment', 'SECRET_KEY_BASE', default: '002bbfb0a35d0fd05b136ab6333dc459')

We want to manage the credentials safely only for production. For the development environment we sometimes don't want to keep credentials in the (safely managed, originally encrypted) YAML file at all, and just want to return an unsafe value as the default.
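
One way the default: keyword could behave, again as a standalone sketch. The helper name fetch_path_with_default, the NO_DEFAULT sentinel and the sample data are assumptions for this illustration; the sentinel only exists so that default: nil can be told apart from "no default given".

```ruby
# Sentinel object marking "no default was passed" (illustrative).
NO_DEFAULT = Object.new.freeze

# Hypothetical helper (not an existing Ruby method): like a chain of #fetch
# calls, but returns the default when any key along the path is missing.
def fetch_path_with_default(container, *keys, default: NO_DEFAULT)
  keys.inject(container) { |current, key| current.fetch(key) }
rescue IndexError # also covers KeyError
  raise if default.equal?(NO_DEFAULT) # re-raise the original error
  default
end

# Illustrative config: only production carries the secret.
config = {
  'production'  => { 'environment' => { 'SECRET_KEY_BASE' => 'real-secret' } },
  'development' => { 'environment' => {} }
}

production_key = fetch_path_with_default(
  config, 'production', 'environment', 'SECRET_KEY_BASE', default: 'unsafe'
)
# => "real-secret"

development_key = fetch_path_with_default(
  config, 'development', 'environment', 'SECRET_KEY_BASE', default: 'unsafe'
)
# => "unsafe"

without_default = begin
  fetch_path_with_default(config, 'development', 'environment', 'SECRET_KEY_BASE')
rescue KeyError => e
  e.class
end
# => KeyError
```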

Updated by 3limin4t0r (Johan Wentholt) 22 days ago

My scenario would be similar to k0kubun's scenario.

# The connection translates the request to JSON and parses the response
# from JSON into the correct objects. In this case a nested hash structure.
response = connection.send(request)

# assign shortcuts
report = response
         .fetch('Era.Common.NetworkMessage.ConsoleApi.Reports.RpcGenerateReportResponse')
         .fetch('report')

column_data   = report.fetch('data').fetch('columns')
column_labels = report.fetch('rendering').fetch('table').fetch('columns')

# build report
report_data = column_data.each_with_object({}) do |column, data|
  column_id       = column.fetch('header').fetch('column_id')
  data[column_id] = column.fetch('values')
end

report = column_labels.each_with_object({}) do |column, data|
  label       = column.fetch('label').fetch('literal')
  column_id   = column.fetch('column_id')
  data[label] = report_data.fetch(column_id)
end

From the above scenario you can see that having this new functionality would help clean things up.

The reason I use #fetch here is that the API I'm talking to might change its structure. Getting an error as soon as possible reduces debug time. If #dig were used, nil would be returned when the structure is invalid. Most of the time this would raise an exception somewhere else, which then needs to be traced back to its source (the changed response structure).
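
A minimal illustration of that difference, using a made-up response in which the 'columns' key has disappeared:

```ruby
# Simulated response whose structure changed: 'columns' is no longer present.
response = { 'report' => { 'data' => {} } }

# With #dig the problem stays hidden until the value is used later on.
columns = response.dig('report', 'data', 'columns') # => nil

late_error = begin
  columns.map { |column| column.fetch('values') }
rescue NoMethodError => e
  e # complains about calling map on nil, far from the actual cause
end

# With #fetch the error points at the missing key right away.
early_error = begin
  response.fetch('report').fetch('data').fetch('columns')
rescue KeyError => e
  e
end

early_error.message # => "key not found: \"columns\""
```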

My preference goes out to dropping the "returning nil if any intermediate step is nil" description (as described in point 2 of the feature proposal). Otherwise, when a key is present but the value is set to nil it will short circuit out of the method. Dropping this part of the #dig description would ensure the full path is traversed.

I also had a look at the linked feature proposal. I find the name #deep_fetch the most descriptive. #fetch_keys sounds like it will fetch multiple keys on a single hash (basically what #fetch_values does). #fetch_all suffers from the same problem. If the eventual version always traverses the full path (see point 2 of the feature proposal) #traverse could be an option.
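
To illustrate the naming concern: the existing #fetch_values reads several keys from one hash, whereas the proposed method walks one path through nested containers. The sample data below is made up.

```ruby
flat   = { 'a' => 1, 'b' => 2 }
nested = { 'a' => { 'b' => 2 } }

# Existing behaviour: several keys, all looked up on the same hash.
side_by_side = flat.fetch_values('a', 'b') # => [1, 2]

# What the proposal is about: one key per nesting level.
along_path_lenient = nested.dig('a', 'b')          # => 2
along_path_strict  = nested.fetch('a').fetch('b')  # => 2

# On the nested hash, #fetch_values looks for 'b' at the top level and fails.
confusion = begin
  nested.fetch_values('a', 'b')
rescue KeyError => e
  e.message
end
# => "key not found: \"b\""
```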
