Feature #19272: Hash#merge: smarter protocol depending on passed block arity - Ruby - Ruby Issue Tracking System

Feature #19272

closed

Hash#merge: smarter protocol depending on passed block arity

Added by zverok (Victor Shepelev) over 2 years ago. Updated over 2 years ago.

Status: Rejected

Assignee:

Target version:

[ruby-core:111461]

Description

Usage of Hash#merge with a "conflict resolution block" is almost always clumsy: because the block accepts |key, old_val, new_val| arguments, and many trivial usages just combine the old and new values in some way, something that should be "intuitively trivial" becomes longer than it should be:

# I just want a sum!
{apples: 1, oranges: 2}.merge(apples: 3, bananas: 5) { |_, o, n| o + n }

# I just want a group!
{words: %w[I just]}.merge(words: %w[want a group]) { |_, o, n| [*o, *n] }

# I just want to unify flags!
{'file1' => File::READABLE, 'file2' => File::READABLE | File::WRITABLE}
  .merge('file1' => File::WRITABLE) { |_, o, n| o | n }

# ...or, vice versa:
{'file1' => File::READABLE, 'file2' => File::READABLE | File::WRITABLE}
  .merge('file1' => File::WRITABLE, 'file2' => File::WRITABLE) { |_, o, n| o & n }

It is especially noticeable in the last two examples, but the usual problem is that there is too much "unnecessary" punctuation, in which the essential part might be lost.

There are proposals like #19148, which struggle to define another method (what would be the name? isn't it just merging?)

But I've been thinking, can't the implementation be chosen based on the arity of the passed block?.. Prototype:

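class Hash
  # Keep the original implementation reachable under another name.
  alias old_merge merge

  # Accept any number of hashes, as the built-in Hash#merge does.
  def merge(*others, &block)
    return old_merge(*others) unless block

    if block.arity.abs == 2
      # Arity 2 or -2 (e.g. a symbol proc): pass only the old and new values.
      old_merge(*others) { |_, o, n| block.call(o, n) }
    else
      # Any other arity: keep the existing |key, old, new| protocol.
      old_merge(*others, &block)
    end
  end
end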

I.e.: if, and only if, the passed block is of arity 2, treat it as an operation on old and new values. Otherwise, proceed as before (maintaining backward compatibility).

Usage:

{apples: 1, oranges: 2}.merge(apples: 3, bananas: 5, &:+)
#=> {:apples=>4, :oranges=>2, :bananas=>5}

{words: %w[I just]}.merge(words: %w[want a group], &:concat)
#=> {:words=>["I", "just", "want", "a", "group"]}

{'file1' => File::READABLE, 'file2' => File::READABLE | File::WRITABLE}
  .merge('file1' => File::WRITABLE, &:|)
#=> {"file1"=>5, "file2"=>5}

{'file1' => File::READABLE, 'file2' => File::READABLE | File::WRITABLE}
  .merge('file1' => File::WRITABLE, 'file2' => File::WRITABLE, &:&)
#=> {"file1"=>0, "file2"=>4}

# If necessary, the old protocol still works:
{apples: 1, oranges: 2}.merge(apples: 3, bananas: 5) { |k, o, n| k == :apples ? 0 : o + n }
# => {:apples=>0, :oranges=>2, :bananas=>5}
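One caveat about the `&:concat` usage above, sketched here with the existing three-argument protocol so it runs on an unpatched Ruby: Array#concat appends in place, and Hash#merge copies the hash but not its values, so the receiver's array is modified as well. The variable names are illustrative only.

```ruby
original = { words: %w[I just] }

# Equivalent to what `&:concat` would do under the proposed protocol.
merged = original.merge(words: %w[want a group]) { |_key, old, new| old.concat(new) }

merged[:words].size                    #=> 5
original[:words].size                  #=> 5, the array held by `original` was appended to
merged[:words].equal?(original[:words]) #=> true, both hashes share one array object

# A non-mutating alternative builds a new array instead.
fresh       = { words: %w[I just] }
fresh_merge = fresh.merge(words: %w[want a group]) { |_key, old, new| old + new }

fresh[:words]       #=> ["I", "just"]
fresh_merge[:words] #=> ["I", "just", "want", "a", "group"]
```

If the receiver must stay untouched, `+` (or `[*o, *n]` as in the description) is the safer operation.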

As far as I can remember, Ruby core doesn't have methods like this (that change implementation depending on the arity of the passed callable), but I think I saw this approach in other languages. I can't remember particular examples, but I have always found this idea appealing.

Updated by zverok (Victor Shepelev) over 2 years ago

Description updated (diff)

Updated by zverok (Victor Shepelev) over 2 years ago

Description updated (diff)

Updated by zverok (Victor Shepelev) over 2 years ago

Description updated (diff)

#4 [ruby-core:111470]

Updated by sawa (Tsuyoshi Sawada) over 2 years ago

Using numbered parameters, we can do slightly better:

{apples: 1, oranges: 2}.merge({apples: 3, bananas: 5}){_2 + _3}

although I am neutral about the proposal.

#5 [ruby-core:111474]

Updated by zverok (Victor Shepelev) over 2 years ago

@sawa I didn't mention the solution with numbered parameters because I believe it to be even more cryptic than the one with named parameters.

The reader needs to remember at all times what the protocol of the merge block is (merge with a block is not used every day, so it is not a given) and what that first argument we are ignoring was.

With named arguments, we can at least give a hint (in some codebases, I use _k, o, n, which is more like "note to self", in others, I prefer _key, oldval, newval or something like that).

#6 [ruby-core:112241]

Updated by nobu (Nobuyoshi Nakada) over 2 years ago

zverok (Victor Shepelev) wrote:

I.e.: if, and only if, the passed block is of arity 2, treat it as an operation on old and new values. Otherwise, proceed as before (maintaining backward compatibility).

Usage:
{apples: 1, oranges: 2}.merge(apples: 3, bananas: 5, &:+)
#=> {:apples=>4, :oranges=>2, :bananas=>5}

:+.to_proc is a proc that just calls the + method on the first argument, passing the rest.
That means its arity is not deterministic.

{words: %w[I just]}.merge(words: %w[want a group], &:concat)
#=> {:words=>["I", "just", "want", "a", "group"]}

In this example, you expect Array#concat to be called on the old values, but the arity of Array#concat is -1, not 2.

#7 [ruby-core:112261]

Updated by zverok (Victor Shepelev) over 2 years ago

@nobu (Nobuyoshi Nakada) All of my examples work with my reference implementation. You can try it yourself.

:any_symbol.to_proc.arity is -2, corresponding to the following lambda:

->(first, *rest) { first.send(symbol, *rest) }

The behavior corresponds, too:

def fake_to_proc(symbol) = ->(first, *rest) { first.send(symbol, *rest) }

:+.to_proc.arity #=> -2
fake_to_proc(:+).arity #=> -2

:+.to_proc.parameters       #=> [[:req], [:rest]]
fake_to_proc(:+).parameters #=> [[:req, :first], [:rest, :rest]]

:+.to_proc.call(1)
# `+': wrong number of arguments (given 0, expected 1) (ArgumentError) -- on handling +, not calling the lambda
fake_to_proc(:+).call(1)
# `+': wrong number of arguments (given 0, expected 1) (ArgumentError)

:+.to_proc.call(1, 2)       #=> 3
fake_to_proc(:+).call(1, 2) #=> 3

Therefore:

1. Any :+.to_proc.arity is -2
2. Which is not a bug/accident, but a proper reporting of arity/parameters
3. Which actually made me think about this idea with merge :)
4. Which works with the reference implementation.

#8 [ruby-core:112265]

Updated by nobu (Nobuyoshi Nakada) over 2 years ago

zverok (Victor Shepelev) wrote in #note-7:

Any :+.to_proc.arity is -2

Which is not a bug/accident, but a proper reporting of arity/parameters

That -2 means just unlimited.

Which actually made me think about this idea with merge :)

.abs == 2? 😅

#9 [ruby-core:112266]

Updated by zverok (Victor Shepelev) over 2 years ago

That -2 means just unlimited.

Well, it is obviously not my call to decide what it means, but I interpret it as "2 explicitly declared params (plus some unpacking probably happening)". I mean, it is not exactly the same as -1 or -3, right?..
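For reference, the Ruby documentation describes a negative arity as -n-1, where n is the number of mandatory arguments. A minimal sketch of how -1, -2 and -3 differ (the lambda names are made up for illustration):

```ruby
# Lambdas follow the same arity rule as methods.
only_rest     = ->(*rest) {}        # no required parameters
one_plus_rest = ->(a, *rest) {}     # one required parameter
two_plus_rest = ->(a, b, *rest) {}  # two required parameters
exactly_two   = ->(a, b) {}         # two required parameters, nothing optional

only_rest.arity     #=> -1
one_plus_rest.arity #=> -2
two_plus_rest.arity #=> -3
exactly_two.arity   #=> 2
```

So -2 encodes "one required parameter plus optional ones", which is what a symbol proc reports: the receiver is required, and any further arguments are forwarded.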

So I believe it is a good enough heuristic for this case because when somebody provides an old-style block, its arity would be:

proc { |key, oldval, newval| }.arity #=> 3

I.e., definitely not 2 or -2.

So, yeah, arity.abs == 2 is a lousy heuristic, but my estimation is that it should be enough to provide a reasonable distinction and to simplify the most common cases.
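To see how the heuristic would sort different block shapes, here is a small sketch. The `value_only_block?` helper is hypothetical; it only restates the `arity.abs == 2` check from the prototype, and the sample blocks are illustrative.

```ruby
# Hypothetical predicate: the same check the prototype uses.
def value_only_block?(block)
  block.arity.abs == 2
end

candidates = {
  "old-style |key, old, new|" => proc { |key, old, new| },
  "values-only |old, new|"    => proc { |old, new| },
  "symbol proc &:+"           => :+.to_proc,
  "numbered { _2 + _3 }"      => proc { _2 + _3 },
  "numbered { _1 + _2 }"      => proc { _1 + _2 },
  "old-style |key, *values|"  => proc { |key, *values| },
}

# Print each block's arity and whether the heuristic treats it as values-only.
candidates.each do |label, block|
  puts format("%-28s arity=%2d value_only=%s", label, block.arity, value_only_block?(block))
end
```

The three-parameter forms keep the old protocol, as intended. The last entry reports arity -2, so it would be treated as values-only even though it is written against the old protocol.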

#10 [ruby-core:112271]

Updated by Eregon (Benoit Daloze) over 2 years ago

-2 means one required argument and a rest argument (e.g. p method(def m(a,*); end).arity => -2).

I think using this new behavior for -2 is too hacky.

For arity == 2, it seems more reasonable, and the examples above could use _1 + _2, etc.
However, changing the behavior for arity 2 could break code like a.merge(b) { |k,old| old }.
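The breakage can be sketched without patching Hash. The `arity_aware_merge` helper below is hypothetical; it mirrors the prototype's dispatch as a plain method so that both behaviors can be compared side by side.

```ruby
# Hypothetical helper: the prototype's dispatch, without redefining Hash#merge.
def arity_aware_merge(hash, other, &block)
  return hash.merge(other) unless block

  if block.arity.abs == 2
    hash.merge(other) { |_key, old, new| block.call(old, new) }
  else
    hash.merge(other, &block)
  end
end

a = { x: 1 }
b = { x: 2 }

# Current behavior: the block receives (key, old, new) and the third argument is dropped.
current = a.merge(b) { |k, old| old }

# Arity-based dispatch: the same block now receives (old, new),
# so the parameter named `old` is bound to the new value.
changed = arity_aware_merge(a, b) { |k, old| old }

current[:x] #=> 1
changed[:x] #=> 2
```

The block is unchanged between the two calls, yet the result flips from "keep the old value" to "take the new value" without any error or warning.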

#11 [ruby-core:112291]

Updated by matz (Yukihiro Matsumoto) over 2 years ago

Status changed from Open to Rejected

It looks nice at first sight but may cause a compatibility issue, as @Eregon (Benoit Daloze) mentioned.

Matz.
