Project

General

Profile

Actions

Bug #18625

closed

ruby2_keywords does not unmark the hash if the receiving method has a *rest parameter

Added by Eregon (Benoit Daloze) 7 months ago. Updated 6 months ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
[ruby-core:107847]

Description

The code below shows the inconsistency.
In all cases the marked Hash is copied at call sites using some_call(*args), however for the case of splat it keeps the ruby2_keywords flag to true, and not false as expected.
This can be observed in user code and will hurt migration from ruby2_keywords to other ways of delegation ((...) and (*args, **kwargs)).
I believe this is another manifestation of #16466.

ruby2_keywords def foo(*args)
  args
end

def single(arg)
  arg
end

def splat(*args)
  args.last
end

def kwargs(**kw)
  kw
end

h = { a: 1 }
args = foo(**h)
marked = args.last
Hash.ruby2_keywords_hash?(marked) # => true

after_usage = single(*args)
after_usage == h # => true
after_usage.equal?(marked) # => false
p Hash.ruby2_keywords_hash?(after_usage) # => false

after_usage = splat(*args)
after_usage == h # => true
after_usage.equal?(marked) # => true, BUG, should be false
p Hash.ruby2_keywords_hash?(after_usage) # => true, BUG, should be false

after_usage = kwargs(*args)
after_usage == h # => true
after_usage.equal?(marked) # => false
p Hash.ruby2_keywords_hash?(after_usage) # => false

Hash.ruby2_keywords_hash?(marked) # => true

I'm implementing Ruby 3 kwargs in TruffleRuby and this came up as an inconsistency in specs.
In TruffleRuby it's also basically not possible to implement this behavior, because at a splat call site where we check for a last Hash argument marked as ruby2_keywords, we have no idea of which method will be called yet, and so cannot differentiate behavior based on that.

cc @jeremyevans0 (Jeremy Evans) @mame (Yusuke Endoh)


Related issues 1 (0 open1 closed)

Related to Ruby master - Bug #16466: `*args -> *args` delegation should be warned when the last hash has a `ruby2_keywords` flagClosedjeremyevans0 (Jeremy Evans)Actions
Actions #1

Updated by Eregon (Benoit Daloze) 7 months ago

  • Related to Bug #16466: `*args -> *args` delegation should be warned when the last hash has a `ruby2_keywords` flag added
Actions #2

Updated by Eregon (Benoit Daloze) 7 months ago

  • Description updated (diff)
Actions #3

Updated by Eregon (Benoit Daloze) 7 months ago

  • Description updated (diff)

Updated by jeremyevans0 (Jeremy Evans) 7 months ago

Based on @matz's comments (https://bugs.ruby-lang.org/issues/16466#note-3), I do not think this is a bug. Splatted arguments where the final argument is a flagged hash will have the hash converted to keywords if the method accepts keywords. If the method does not accept keywords, I don't think the behavior is specified.

I think this change is more consistent than the current code. I cannot think of a justification why single(*args) would return an unflagged copy, but splat(*args) would return the flagged hash. I think there are valid arguments for both returning flagged hash, with the argument that flagged hashes only affect methods that accept keywords, and other methods they are passed straight through. I also think this change returning an unflagged copy is justifiable, with the argument that foo(*args) where the final element of args is a flagged hash should always be treated as foo(*args[0...-1], **args[-1]), regardless of the definition of foo.

I recommend that we accept this change for 3.2. I'm not sure we should backport this to 3.0 or 3.1, since it could break existing code (unlikely, but possible).

In case this change is considered desirable (either as a bug fix or as a deliberate feature change), I have submitted a pull request for it: https://github.com/ruby/ruby/pull/5645 . The pull request does not require any additional allocations.

Updated by Eregon (Benoit Daloze) 7 months ago

Thank you for the PR, I think we should merge it.

Fixing this is important for multiple reasons:

  • Can cause issues when migrating to other forms of delegation
  • Makes the ruby2_keywords semantics confusing/inconsistent/more complex
  • May force other Ruby implementations to replicate this bug if not fixed

First of all, this can avoid bad surprises when switching from ruby2_keyword to (*args, **kwargs) or (...) (which makes sense e.g. when a gem no longer needs to support Ruby 2):
A simple example:

ruby2_keywords def foo(*args)
  bar(*args)
end

def bar(*args)
  baz(*args)
end

def baz(a:)
  a
end

p foo(a: 1) # => 1

This works on current master, but it should not, there is a missing ruby2_keywords on baz. The fact it does not fail may also confuse Ruby users (they might think the flag is kept across calls).

And if I translate this example to use (*args, **kwargs) instead of ruby2_keywords, it will be broken (on all versions):

def foo(*args, **kwargs)
  bar(*args, **kwargs)
end

def bar(*args)
  baz(*args)
end

def baz(a:)
  a
end

p foo(a: 1) # => in `baz': wrong number of arguments (given 1, expected 0; required keyword: a) (ArgumentError)

Second, as mentioned above this is an inconsistency and if any user observes this they will likely be very confused for good reasons.
The semantics of ruby2_keywords should stay as simple as possible, because it's already quite complex.
Notably:

  • The ruby2_keywords flag for a Hash instance never changes, the only way to "change" it is to create a new Hash instance with another value for the flag (already the case, great).
  • When the ruby2_keywords flag of a Hash is used at a foo(*args) call site, the callee receives a Hash without the flag set. This is necessary as keeping the flag would break code by continuing to treat the Hash as kwargs when it should not, and to provide an easy migration to other ways of delegation. This guarantee currently holds for all cases, except for the case reported in this issue.

And third since this behavior is observable to users, it might force other Ruby implementations to replicate this bug, which would be very unfortunate.
This behavior does not make sense semantically and can be tricky to implement.
The property that a call site does not need to know about a callee is a valuable one (both conceptually and in the implementation as it has performance implications), and this bug breaks that (the arguments are treated differently based on a specific callee).

Updated by Eregon (Benoit Daloze) 7 months ago

From the dev meeting log:
Conclusion:
matz: I think it is good to fix
@mame (Yusuke Endoh): It may break (a lot of?) existing programs, but hopefully easy to fix. I’ll discuss

Updated by jeremyevans0 (Jeremy Evans) 7 months ago

This cannot be merged yet as it currently causes a couple errors in the Bundler specs:

I think those specs need to be changed so they pass with either keywords or a positional hash.

Updated by Eregon (Benoit Daloze) 7 months ago

@jeremyevans0 (Jeremy Evans) There is a good chance this is missing ruby2_keywords calls in RSpec, could you try with https://github.com/rspec/rspec-mocks/pull/1464?

Updated by jeremyevans0 (Jeremy Evans) 7 months ago

Eregon (Benoit Daloze) wrote in #note-8:

@jeremyevans0 (Jeremy Evans) There is a good chance this is missing ruby2_keywords calls in RSpec, could you try with https://github.com/rspec/rspec-mocks/pull/1464?

I'm not sure how to get CI to run with that pull request. Can you please try instead?

Updated by Eregon (Benoit Daloze) 7 months ago

Testing Bundler seems to use a Gemfile, so I guess we can just make it point to my branch for testing: https://github.com/ruby/ruby/pull/5684

Updated by mame (Yusuke Endoh) 7 months ago

As far as I understand, this change will break the following code:

def target(a:)
  p a
end

# after the patch, ruby2_keywords is requried here
def bar(*args)
  target(*args)
end

ruby2_keywords def foo(*args)
  bar(*args)
end

p foo(a: 42) #=> 42 in Ruby 3.1
             #=> ArgumentError after the patch

After the patch, we need to add ruby2_keywords to the definition of bar.

According to our previous experience, rails is using this kind of multi-stage delegation pattern so much. Maybe we need to ask the core developers of rails to check and support the change. cc/ @kamipo (Ryuta Kamizono)

Updated by Eregon (Benoit Daloze) 7 months ago

Yes, that's correct, and that code would also break when migrating to other form of delegations (if not changing bar too).

I don't understand why we had these semantics in the first place, I think there are hard to understand for anyone (it seems it was unintentional).
They might lead people to think the ruby2_keywords flag is kept forever, but that is not the case.
The rule is the ruby2_keywords flag is "reset" (i.e., the Hash in the callee is not flagged) once used, e.g., on foo(*flagged_args).
This always holds, except for the case where the callee has a rest arg (this issue).

@mame (Yusuke Endoh) noticed as well and I said it back just after the 2.7 release that we should have fixed this (it would probably have been easier), but it seems I was not heard: #16466.
I believe this is worth fixing, for simpler semantics and for being able to migrate to other forms of delegations without major troubles.
It is easier to add some ruby2_keywords now than while migrating to (*args, **kwargs)/(...) figure out several ruby2_keywords were missing and have to review all places which do delegation.
It is likely some gems which do lots of delegation might miss a few ruby2_keywords, like rspec-mocks does.
Adding missing ruby2_keywords is not particularly hard, adding puts caller at the end is enough to find all of them along a delegation chain.

Updated by Eregon (Benoit Daloze) 6 months ago

FWIW, I found that Rails is currently missing some ruby2_keywords due to this bug.
For example when running rspec-rails specs (cd rspec-rails && bundle && bundle exec rspec spec/rspec/rails/configuration_spec.rb), I counted about 7 missing ruby2_keywords for places which delegate with *args:

./spec/rspec/rails/configuration_spec.rb:273:in `welcome'
gems/actionpack-6.0.4.7/lib/abstract_controller/base.rb:195:in `process_action'
gems/actionpack-6.0.4.7/lib/abstract_controller/callbacks.rb:42:in `block in process_action'
gems/activesupport-6.0.4.7/lib/active_support/callbacks.rb:101:in `run_callbacks'
gems/actionpack-6.0.4.7/lib/abstract_controller/callbacks.rb:41:in `process_action'
gems/actionpack-6.0.4.7/lib/abstract_controller/base.rb:136:in `process'
gems/actionmailer-6.0.4.7/lib/action_mailer/rescuable.rb:25:in `block in process'
gems/actionmailer-6.0.4.7/lib/action_mailer/rescuable.rb:17:in `handle_exceptions'
gems/actionmailer-6.0.4.7/lib/action_mailer/rescuable.rb:24:in `process'
gems/actionview-6.0.4.7/lib/action_view/rendering.rb:39:in `process'
gems/actionmailer-6.0.4.7/lib/action_mailer/base.rb:637:in `block in process'
gems/activesupport-6.0.4.7/lib/active_support/notifications.rb:180:in `block in instrument'
gems/activesupport-6.0.4.7/lib/active_support/notifications/instrumenter.rb:24:in `instrument'
gems/activesupport-6.0.4.7/lib/active_support/notifications.rb:180:in `instrument'
gems/actionmailer-6.0.4.7/lib/action_mailer/base.rb:636:in `process'
gems/actionmailer-6.0.4.7/lib/action_mailer/message_delivery.rb:124:in `block in processed_mailer'

I think this means other Ruby implementations will be forced to replicate this bug (and its performance and significant complexity implications) as long as they target Ruby 3.0/3.1 compatibility, as multiple gems will rely on it without knowing it, which is very unfortunate, and it will take some time until such gems have the fix merged and have a release with it.

I still do believe it is worth fixing this for 3.2 and fix the gems, that will make it easier for gems when they can assume Ruby 3+ and migrate to (*args, **kwargs)/(...).

I also cannot hide the fact that I am very disappointed that #16466 was not fixed back then, before the Ruby 3 release. It seemed a straightforward bug with bad consequences if unfixed (and indeed it turns out to be the case), and it was reported days after the 2.7.0 release, so plenty of time before 3.0 or even 2.7.1. I'm afraid matz did not understand the situation in https://bugs.ruby-lang.org/issues/16466#note-3, in fact the fix does help to find the correct places for ruby2_keywords. But we cannot return in the past to change things, so let's focus on now and the future.

Actions #14

Updated by Eregon (Benoit Daloze) 6 months ago

  • Description updated (diff)

Updated by Eregon (Benoit Daloze) 6 months ago

I also noticed another inconsistency on 3.0.3 and current master: -> *args { args.last }.call(*args) does not copy a marked Hash, but send(:foo, *args) (foo taking *args) does copy it (even though both are implemented in C and takea rest parameter).
@jeremyevans0's PR fixes that too, by always copying and unmarking, as expected.

Updated by Eregon (Benoit Daloze) 6 months ago

I merged https://github.com/ruby/ruby/pull/5684, which is Jeremy's PR + a needed update of rspec-mocks + a NEWS entry.
This is the only way to really assess the impact on gems/apps depending on this bug, and to let gems/apps to find and add the missing ruby2_keywords (asking gems or even rails to use a branch of ruby is a lot of overhead).

Because this was a bug, and there is no other way to fix it, and this fix will help migration to other forms of delegation, I think it is very unlikely we would revert this change, but it remains a possibility until the 3.2 release.

A good technique to find the potentially-missing ruby2_keywords is to run the test suite, for where it fails find the last method which must receive keyword arguments,
use puts nil, caller, nil there, and check each method on the call chain which must delegate keywords is correctly marked as ruby2_keywords.

Actions #17

Updated by Eregon (Benoit Daloze) 6 months ago

  • Status changed from Open to Closed
Actions #18

Updated by Eregon (Benoit Daloze) 6 months ago

  • Target version set to 3.2
Actions

Also available in: Atom PDF