Project

General

Profile

Bug #15078

Hash splat of empty hash should not create a positional argument.

Added by marcandre (Marc-Andre Lafortune) 7 months ago. Updated 6 months ago.

Status:
Open
Priority:
Normal
Target version:
-
ruby -v:
ruby 2.6.0dev (2018-08-27 trunk 64545) [x86_64-darwin15]
[ruby-core:88867]

Description

Looks like #10856 is not completely fixed, but I can't reopen it

def foo(*args); args; end
foo(**{}) # => []
foo(**Hash.new) # => [{}], should be []

Related issues

Related to Ruby trunk - Bug #10856: Splat with empty keyword args gives unexpected resultsClosedActions
Related to Ruby trunk - Bug #15052: must not optimize `foo(**{})` outOpenActions
Related to Ruby trunk - Feature #14183: "Real" keyword argumentOpenActions

History

#1

Updated by marcandre (Marc-Andre Lafortune) 7 months ago

  • Related to Bug #10856: Splat with empty keyword args gives unexpected results added

Updated by mame (Yusuke Endoh) 7 months ago

It is not so trivial what it should be. If you look at only foo(**{}) #=> [{}] itself, it might be non-intuitive, indeed. However, IMO, this is a consequence of the current weird spec of keyword arguments (a keyword hash is passed as a normal last argument). Unless the spec itself changes, it would be more consistent for foo(**{}) to return [{}].

See the following example:

def foo(*args)
  p args
end
foo(**{k1: 1, k2: 2, k3: 3}) #=> [{:k1=>1, :k2=>2, :k3=>3}]
foo(**{k1: 1, k2: 2})        #=> [{:k1=>1, :k2=>2}]
foo(**{k1: 1})               #=> [{:k1=>1}]
foo(**{})                    #=> [] # surprising, should be [{}]

The number of arguments changes depending upon the value. I think such a behavior is error-prone. It should not occur, unless the caller uses foo(*ary) explicitly. People will not expect the number change in foo(**hsh).

In fact, the fix you proposed will make it difficult to test the following program:

def f2(h)
end

def f1(foo, bar)
  h = {}
  h[:foo] = :foo if foo
  h[:bar] = :bar if bar
  f2(**h)
end

It will break only when f1(false, false).

Do you think it is bad to accept a keyword hash (f2(**h)) as a normal parameter (def f2(h))? Yes, I agree. So I'm proposing #14183. If keyword arguments are split from normal arguments, we can fix this issue gracefully: we can always ignore **{} safely. If the current weird spec is kept, I don't think it is a good idea to "fix" this ad-hocly just because it might be non-intuitive.

#3

Updated by mame (Yusuke Endoh) 7 months ago

  • Related to Bug #15052: must not optimize `foo(**{})` out added
#4

Updated by mame (Yusuke Endoh) 7 months ago

Updated by marcandre (Marc-Andre Lafortune) 7 months ago

First, I hope we can agree that any(**{}) and any(**Hash.new) should have the exact same result (for any definition of any).

So the question is: what should the result be?

For me, writing **hash means "do as if I had written that hash inline".
I.e. any(**{k1: 1, k2: 2}) should have the exact same result as any(k1: 1, k2: 2)
In the same way, any(**{}) should have the exact same result as any()

Advantages I see:
a) Allows two ways to safely forward all arguments to another method:

def middle(*args, &b)
  forward_to(*args, &b)
end

# or equivalently
def middle(*args, **options, &b)
  forward_to(*args, **options, &b)
end

In either cases, calling middle(...) has same effect as forward_to(...).

My guess is that a huge number of method_missing definitions (like that of lib/delegate: https://github.com/ruby/ruby/blob/trunk/lib/delegate.rb#L78-L89) use either formulas to forward calls, for example.

b) Compatible with current behavior for most methods (those without positional rest argument)

empty_hash = {}
def foo(a, b); end; foo(1, 2, **empty_hash) # => no error
def foo(a, b=nil); end; foo(1, 2, **empty_hash) # => no error

Of course, one wouldn't write such code, but this is related to the previous point.

mame (Yusuke Endoh) wrote:

Do you think it is bad to accept a keyword hash (f2(**h)) as a normal parameter (def f2(h))? Yes, I agree.

I am convinced that the vast majority of rubyists would either write def f2(options = {}) (old style), or def f2(**options) (new style). I think it is good to accept f2(**h) as a normal parameter def f2(h = {}).

So I'm proposing #14183.

I understand the goal and I completely agree that if we were to re-create Ruby from scratch it would be great. But Ruby is more than 20 years old with over 100k gems. Because of that my position has to be strongly against the proposal.

Now even if the proposal was accepted and implemented, then **{} would still never create a positional argument! So I believe that you actually agree with me :-)

Updated by mame (Yusuke Endoh) 7 months ago

marcandre (Marc-Andre Lafortune) wrote:

First, I hope we can agree that any(**{}) and any(**Hash.new) should have the exact same result (for any definition of any).

Of course :-)

For me, writing **hash means "do as if I had written that hash inline".
I.e. any(**{k1: 1, k2: 2}) should have the exact same result as any(k1: 1, k2: 2)
In the same way, any(**{}) should have the exact same result as any()

What do you think about #15052?

def foo(opt = "opt", **hsh)
  p [opt, hsh]
end

foo({}, **{})  #=> expected: [{}, {}], actual: ["opt", {}]

According to your interpretation, foo({}, **{}) is equal to foo({}), so the current result is correct. We need to write foo({}, {}) to get the expected result.
I added ** explicitly to make sure that the second hash is a keyword hash, which made a bug. This spoils the feature of double splat, I think.

Advantages I see:
a) Allows two ways to safely forward all arguments to another method:

Yes, good point. If keyword arguments are separated from normal ones, we need to always forward three component: def foo(*a, **h, &b); bar(*a, **h, &b); end. However, the code does not always work on the current semantics, as you said. We need a migration path, or a coding style that works on both 2.X and 3.X, so this is a severe problem.

Surprisingly, in 2010, matz predicted this issue and proposed a new syntax for delegation in #3447: def foo(...); bar(...); end. Unfortunately, it has not been accepted yet, though.

Now even if the proposal was accepted and implemented, then **{} would still never create a positional argument! So I believe that you actually agree with me :-)

I'd like to agree with you, but also really like to fix #15052. Do you find a good semantics of **hash to satisfy both this ticket and that one?

Updated by marcandre (Marc-Andre Lafortune) 7 months ago

  • Assignee changed from nobu (Nobuyoshi Nakada) to matz (Yukihiro Matsumoto)

mame (Yusuke Endoh) wrote:

I'd like to agree with you, but also really like to fix #15052. Do you find a good semantics of **hash to satisfy both this ticket and that one?

This is a long post, sorry.

  • Summary *

I agree that in your "two hash" example, foo({}, **{}) should return [{}, {}]. I believe it is possible to have semantics that allow that. I am proposing such a semantic below that would have the following consequences:

h = {x: 42}
def with_keyword(*args, **options)
  [args, options]
end

with_keyword(h, **{}) # => [[h], {}] (as in 15052, currently [[], h])   

def no_keyword(*args)
  args
end
no_keyword(h, **{}) # => [h] (currently: [h, {}])

The current behavior is actually problematic, in particular with full forwarding (of the type *args, **options), while I believe my proposal fixes it.
This paves the way for a stricter handling of keyword parameters (like your proposal #14183).

  • Proposed semantics *

Ruby 2.x will:
1) promote the last positional argument as keyword argument whenever possible (as is currently does)
2) demote keywords arguments to a positional argument if needed (as is currently does, but maybe differently for empty case)

In detail:
1) When calling method(a, b, last), then last will be promoted to a keyword argument if possible, i.e. if:
a) method takes keyword arguments (any of key:, key: val, or **options in signature)
b) and all mandatory positional arguments of method are provided (here by a and b)
c) and last is hash-like (i.e. responds_to? :to_hash)
d) and all keys of last.to_hash are symbols

Otherwise, last will remain a positional argument.
This is current behavior and should remain unchanged in Ruby 2.x, maybe open for removal in Ruby 3.

2) When calling method(a, b, key: value) or method(a, b, **hash) or a combination of these, the keyword arguments (here {key: value} or hash) will be demoted to a positional argument if needed, i.e.
a) if method does not accept keyword arguments
b) and they are non empty

This is slightly different than currently in some cases for **{}.

  • Limitation *

While your example in "two hash" example would still work, it can't be forwarded with the naive approach:

def with_keyword(*args, **options)
  [args, options]
end

def forward(*args)
  with_keyword(*args)
end

forward(h, **{}) # => [[], h]   not same as with_keyword(h, **{}) # => [[h], {}]

Only forwarding with the full approach would work.

I believe that very few real life cases would be problematic. If your proposal is accepted, or half of it like I recommend, this won't be an issue.

  • The future *

I think the reason why we haven't had too many bug reports with **{} yet is that the usual generic ways of forwarding are either the delegate library or ActiveSupport's delegate :method, to: receiver.

Both currently use the naive forwarding (*args) instead of full forwarding (*args, **options). So neither actually deal with **{}.

The naive forwarding relies on promotion of last positional argument to keyword arguments. Before we can remove that automatic promotion, we need to make sure that full forwarding works well.

If **{} creates a positional argument, this makes any normal method with optional arguments non compatible with full forwarding! For example:

def multiply(*nbs)
  nbs.inject(1, :*)
end

def forward(*a, **o)
  multiply(*a, **o)
end

forward(1, 2, 3) # => should call multiply(1, 2, 3) but currently calls multiply(1, 2, 3, {}) which raises error.

I believe it is absolutely imperative that the code above works, in Ruby 2.x and 3.x.

For that reason I believe that **{} should never create a positional argument.

Updated by mame (Yusuke Endoh) 6 months ago

Marc-Andre,

marcandre (Marc-Andre Lafortune) wrote:

  • Proposed semantics *

I thought it looked good. However, I introduced it to some committers at the developers' meeting, and akr-san showed it was still incomplete:

def target(*args)
  p args
end

def forward(*args, **kw, &blk)
  target(*args, **kw, &blk)
end

target(1, 2, 3, {})  #=> [1, 2, 3, {}]
forward(1, 2, 3, {}) #=> [1, 2, 3] on the semantics you proposed ([1, 2, 3, {}] on the current behavior)

Akr-san said that it would be impossible to create a "perfect" semantics to satisfy all cases. In Ruby 2.X, he said that **empty_hash should be just ignored, and that foo({}, **{}) should be always equal to foo({}), even if def foo(opt=42, **kw) receives opt=42. (However, matz showed reluctance to this behavior. So, the solution for this ticket is not decided yet.)

Also, akr-san and some committers agreed with my proposal of "real" keyword arguments to solve all the kind of design flaws. (Surprisingly, there was no committer there who was against my proposal, even though it breaks compatibility. Matz also looked positive, but he has not made the decision yet.)
It would be impossible to write a delegation code that works perfectly on both 2.X and 3.0, but it would work good enough on almost all practical cases, akr-san said.

Akr-san and some committers said that a hash or keyword should be decided syntactically based on => or :. foo(key: 1) should pass a keyword andfoo(:key => 1)should pass a hash (i.e.,foo({:key => 1})`). I'll consider it and update my proposal.

Updated by marcandre (Marc-Andre Lafortune) 6 months ago

mame (Yusuke Endoh) wrote:

Akr-san said that it would be impossible to create a "perfect" semantics to satisfy all cases.

I'm sorry if I sounded like I had a "complete" solution, since Akr-san is right, there is no such perfect semantics. Proof

# Let foo be some method of unknown signature

# Forward bar to foo:
def bar(*args, **opt, &b)
  foo(*args, **opt, &b)
end

# If forwarding works, we have: 
bar({}) === foo({})
bar() === foo()
# But by definition above, we have
bar({}) === foo(**{})   # Since args = [], opt = {} (at least in Ruby 2.x)
bar()   === foo(**{})   # Since args = [], opt = {}
# So, for any `foo`, we would need
foo() === foo({})  # Not necessarily true...

In Ruby 2.X, he said that **empty_hash should be just ignored, and that foo({}, **{}) should be always equal to foo({}), even if def foo(opt=42, **kw) receives opt=42. (However, matz showed reluctance to this behavior. So, the solution for this ticket is not decided yet.)

I'm glad he and I agree on the fact that is should be ignored most of the time. It could also not be ignored for methods that accept keyword parameters, which I believe is more logical and closer to what Ruby 3.0 would do.

Also, akr-san and some committers agreed with my proposal of "real" keyword arguments to solve all the kind of design flaws. (Surprisingly, there was no committer there who was against my proposal, even though it breaks compatibility. Matz also looked positive, but he has not made the decision yet.)
It would be impossible to write a delegation code that works perfectly on both 2.X and 3.0, but it would work good enough on almost all practical cases, akr-san said.

Please consider my "vote" as in favor of half your proposal, and strong opposition to the other half :-)

Akr-san and some committers said that a hash or keyword should be decided syntactically based on => or :.

I agree that it should be syntactical. We can still decide that foo(:key => 1) is a keyword argument though. I'm undecided about this. Only important thing is that foo(some_var => 1) should never be considered a keyword argument, even if some_var is a Symbol.

Updated by mame (Yusuke Endoh) 6 months ago

marcandre (Marc-Andre Lafortune) wrote:

I'm sorry if I sounded like I had a "complete" solution, since Akr-san is right, there is no such perfect semantics.

Don't mind; I had considered your proposal in a few days, but I couldn't find the counterexample :-) The current semantics is just too complex.

In Ruby 2.X, he said that **empty_hash should be just ignored, and that foo({}, **{}) should be always equal to foo({}), even if def foo(opt=42, **kw) receives opt=42. (However, matz showed reluctance to this behavior. So, the solution for this ticket is not decided yet.)

I'm glad he and I agree on the fact that is should be ignored most of the time. It could also not be ignored for methods that accept keyword parameters, which I believe is more logical and closer to what Ruby 3.0 would do.

Yes. Your proposal looks the best semantics so far for Ruby 2.X, at least to me. Note that akr-san is also reasonable: from 2.0, a method call has always passed the same value array and a block, without depending upon the definition of the method, he said. Your proposal breaks the principle. (Honestly, I'm unsure if the principle is still valid or not. Too complex...)

Please consider my "vote" as in favor of half your proposal, and strong opposition to the other half :-)

Did you see my problem of the note 29 in #14183? Unfortunately, prohibiting the half does not solve the problem at all.

Updated by marcandre (Marc-Andre Lafortune) 6 months ago

mame (Yusuke Endoh) wrote:

Yes. Your proposal looks the best semantics so far for Ruby 2.X, at least to me. Note that akr-san is also reasonable: from 2.0, a method call has always passed the same value array and a block, without depending upon the definition of the method, he said. Your proposal breaks the principle. (Honestly, I'm unsure if the principle is still valid or not. Too complex...)

Indeed, my proposal breaks that (on purpose) :-)

Please consider my "vote" as in favor of half your proposal, and strong opposition to the other half :-)

Did you see my problem of the note 29 in #14183? Unfortunately, prohibiting the half does not solve the problem at all.

Yes, I saw it. It's definitely true my proposal wouldn't fix it (until Ruby 42.0.0 ;-) ). I'm not very concerned about it though. Care would still be required when changing a method from positional arguments only to keyword arguments, if the last positional argument could be a hash. That's a rare case, and you can write your code to avoid the issue.

I feel the incompatibility with key: value no longer being converted to positional if need be is simply too big for the gain.

Updated by akr (Akira Tanaka) 6 months ago

mame (Yusuke Endoh) wrote:

def target(*args)
  p args
end

def forward(*args, **kw, &blk)
  target(*args, **kw, &blk)
end

target(1, 2, 3, {})  #=> [1, 2, 3, {}]
forward(1, 2, 3, {}) #=> [1, 2, 3] on the semantics you proposed ([1, 2, 3, {}] on the current behavior)

Akr-san said that it would be impossible to create a "perfect" semantics to satisfy all cases. In Ruby 2.X, he said that **empty_hash should be just ignored, and that foo({}, **{}) should be always equal to foo({}), even if def foo(opt=42, **kw) receives opt=42. (However, matz showed reluctance to this behavior. So, the solution for this ticket is not decided yet.)

How about def m(**kw) end binds kw to nil if the last Hash argument is not taken as keyword arguments and
m(**{}) add {} and m(**nil) don't add anything to arguments.

{}/nil distinguish there are keyword arguments or not.
This information is what lacks to implement perfect delegation method.

In this behavior, following two should be same behavior (in Ruby 2.X), I think.

def f(*a, *kw) g(*a, *kw) end
def f(*a) g(*a) end

Updated by akr (Akira Tanaka) 6 months ago

akr (Akira Tanaka) wrote:

In this behavior, following two should be same behavior (in Ruby 2.X), I think.

def f(*a, *kw) g(*a, *kw) end
def f(*a) g(*a) end

Oops. def f(*a, *kw) g(*a, *kw) end was wrong.
I wanted to write def f(*a, **kw) g(*a, **kw) end

Also, I forgot to describe about symbol/non-symbol key Hash splitting.

Symbol/non-symbol key Hash splitting breaks delegation with
def f(*a, **kw) g(*a, **kw) end.
It can be fixed with removing the Hash splitting, or
move the Hash splitting mechanism to caller from callee.
I think former is more compatible.

I.e. following script should show [{:a=>"b", 1=>2}] twice.
(a in g should be [{:a=>"b", 1=>2}] and
kw should be nil)

% ruby -e '
def f(*a) p a end
def g(*a, **kw) f(*a, **kw) end
f(a: "b", 1 => 2)
g(a: "b", 1 => 2)
'
[{:a=>"b", 1=>2}]
[{1=>2}, {:a=>"b"}]

Updated by mame (Yusuke Endoh) 6 months ago

akr (Akira Tanaka) wrote:

How about def m(**kw) end binds kw to nil if the last Hash argument is not taken as keyword arguments and
m(**{}) add {} and m(**nil) don't add anything to arguments.

I agree that it is one of the reasonable design choices, but does it break the following code?

def foo(**kw)
  kw[:key]
end

foo()

Applying it in Ruby 2.X would bring a big incompatibility issue, I guess. Rather, it might be worse than separation of keyword argument because the incompatibility is data-dependent: it can not always be fixed syntactically.

def foo(**kw)
  @option = kw
end

...

def bar
  @option[:key] #=> Null pointer
end

Updated by marcandre (Marc-Andre Lafortune) 6 months ago

akr (Akira Tanaka) wrote:

How about def m(**kw) end binds kw to nil if the last Hash argument is not taken as keyword arguments and
m(**{}) add {} and m(**nil) don't add anything to arguments.

Very clever :-)

As Mame-san points out, it would be a source of incompatibility. I would really like to know how big though.

I imagine that most cases of **h capture are used to forward them (so no incompatibility).

I looked at DeepCover's code:
34 cases of def foo(... **option)
26 are ok (about 22 simply forward **option)
8 are incompatible (2 of which were simply missing hash splat)
An example: https://github.com/deep-cover/deep-cover/blob/master/core_gem/lib/deep_cover/tools/content_tag.rb#L6-L9

Updated by akr (Akira Tanaka) 6 months ago

mame (Yusuke Endoh) wrote:

akr (Akira Tanaka) wrote:

How about def m(**kw) end binds kw to nil if the last Hash argument is not taken as keyword arguments and
m(**{}) add {} and m(**nil) don't add anything to arguments.

I agree that it is one of the reasonable design choices, but does it break the following code?

def foo(**kw)
  kw[:key]
end

foo()

Yes. I hope that the incompatibility is acceptable.

If it is not acceptable, I have second idea:
We can use a special frozen empty Hash object instead of nil.

NO_KEYWORD_ARGUMENTS = {}.freeze

In this idea, the condition of "no keyword argument" is represented as
NO_KEYWORD_ARGUMENTS.equal?(x) instead of nil.equal?(x).

It is still incompatible because it is not modifiable, though.

def foo(**kw) kw[0] = 1 end; foo()

If kw is bind to NO_KEYWORD_ARGUMENTS, kw[0] = 1 raises FrozenError.

Also, more importantly, it adds a new constant and
we (Ruby programmers) need to remember the name.

Updated by akr (Akira Tanaka) 6 months ago

akr (Akira Tanaka) wrote:

If it is not acceptable, I have second idea:

My third idea is to use an instance variable to distinguish
the two kind of {}.

We can create {} with a special instance variable to distinguish them.

h = {}
h.instance_variable_set(:@no_keyword_arguments, true)

It can be detected as h.instance_variable_defined?(:@no_keyword_arguments).

This is almost compatible, I think.

Updated by akr (Akira Tanaka) 6 months ago

My fourth idea: FL_USERn flag in `{}'.

Updated by marcandre (Marc-Andre Lafortune) 6 months ago

akr (Akira Tanaka) wrote:

We can create {} with a special instance variable to distinguish them.

I'm impressed by your creativity :-)

So, would we have the following?

def foo(**opt)
  opt.instance_variable_get(:@no_keyword_arguments)
end

foo() # => true
foo({}) # => true ?
foo(**{}) # => false
special = {}; special.instance_variable_set(:@no_keyword_arguments, true)
foo(**special) # => true
foo(special) # => true

So this would influence how if it gets converted to positional argument, right?

def bar(*args)
  args
end

bar(**{}) # => [{}]
bar(**special) # => []

Also available in: Atom PDF