Project

General

Profile

Actions

Bug #19113

closed

Inconsistency in retention of compare_by_identity flag in Hash methods

Added by jeremyevans0 (Jeremy Evans) over 1 year ago. Updated 12 months ago.

Status:
Closed
Assignee:
-
Target version:
-
ruby -v:
ruby 3.2.0dev (2022-11-07T17:29:28Z master 9001e53e68) [x86_64-openbsd7.2]
[ruby-core:110663]

Description

Hash.[] and Hash.ruby2_keywords_hash retain the compare_by_identity flag for non-empty hashes, but do not retain it for empty hashes:

hs = [{}.compare_by_identity, {:a=>1}.compare_by_identity]
hs.map{|h| Hash[h].compare_by_identity?}
# => [false, true]
hs.map{|h| Hash.ruby2_keywords_hash(h).compare_by_identity?}
# => [false, true]

This inconsistency seems like a bug.

Hash#compact always drops the compare_by_identity flag, but it is documented as returning a copy of self, implying the compare_by_identity flag is kept (since #dup and #clone retain the flag).

{}.compare_by_identity.compact.compare_by_identity?
# => false

I'm not sure whether is a bug, because it is consistent, but I think retaining the flag makes more sense.

I'll try to work on a fix for both of these issues tomorrow.

Updated by jeremyevans0 (Jeremy Evans) over 1 year ago

I think the following behavior makes the most sense:

  • Hash.[] should never retain the compare_by_identity flag. It doesn't copy the default value/proc, so retaining the compare_by_identity flag does not make sense.
  • Hash.ruby2_keywords_hash should always retain the compare_by_identity flag. It copies the default value/proc, so retaining the compare_by_identity flag makes sense.
  • Hash#compact should copy the default value/proc and retain the compare_by_identity flag. It currently does neither. However, the documentation says it returns a copy of the receiver, which implies the result should have the same default value/proc and compare_by_identity flag as the receiver.

I've submitted a pull request for these changes: https://github.com/ruby/ruby/pull/6702

Updated by ko1 (Koichi Sasada) over 1 year ago

I think Hash[] remains all keys on the new Hash so compare_by_identity should be remained.

Updated by headius (Charles Nutter) over 1 year ago

jeremyevans0 (Jeremy Evans) wrote in #note-1:

I think the following behavior makes the most sense:

  • Hash.[] should never retain the compare_by_identity flag. It doesn't copy the default value/proc, so retaining the compare_by_identity flag does not make sense.

I disagree. The contents of the other hash will have been populated using identity comparison rather than equality comparison. If that characteristic does not propagate to the new hash, any == keys will collide or else the set of keys will have to change. Is Hash.[] intended to make a copy, or a new hash populated with a subset of the original keys using non-identity, non-default semantics?

I agree with your other points.

Updated by headius (Charles Nutter) over 1 year ago

FWIW I ran into this while implementing Ruby 3.1 identhash semantics in JRuby, and then realized this is now one of the few "copy" paths that does not propagate identity comparison.

Updated by jeremyevans0 (Jeremy Evans) over 1 year ago

headius (Charles Nutter) wrote in #note-3:

jeremyevans0 (Jeremy Evans) wrote in #note-1:

I think the following behavior makes the most sense:

  • Hash.[] should never retain the compare_by_identity flag. It doesn't copy the default value/proc, so retaining the compare_by_identity flag does not make sense.

I disagree. The contents of the other hash will have been populated using identity comparison rather than equality comparison. If that characteristic does not propagate to the new hash, any == keys will collide or else the set of keys will have to change. Is Hash.[] intended to make a copy, or a new hash populated with a subset of the original keys using non-identity, non-default semantics?

It's explicitly documented as returning a new hash, not a copy: Returns a new Hash object populated with the given objects.

The fact that it doesn't copy the default value/proc indicates to me that the compare_by_identity flag should not be copied either. However, I don't feel strongly regarding this. We can make it so the compare_by_identity flag is always copied. We do need to make some change, because the current behavior is inconsistent.

Updated by headius (Charles Nutter) over 1 year ago

Even though it's a "new Hash", it is supposed to be populated with "the given objects". If losing identity comparison means some keys don't get populated, I would consider that broken.

Updated by jeremyevans0 (Jeremy Evans) over 1 year ago

headius (Charles Nutter) wrote in #note-6:

Even though it's a "new Hash", it is supposed to be populated with "the given objects". If losing identity comparison means some keys don't get populated, I would consider that broken.

If you turn the hash into an array, you have the same given objects, but different results:

h = {'a'=>1}.compare_by_identity
h['a'] = 2
Hash[h]      # {'a'=>1, 'a'=>2}.compare_by_identity
Hash[h.to_a] # {'a'=>2}

Not respecting the compare_by_identity flag (as my PR does) would make Hash[h] == Hash[h.to_a].

I would say that populated with the given objects does not imply resulting in the same set of keys. Also, given objects (not given object) implies it looks at the content of the hash, but not the hash object itself (hence why default value/proc is not copied).

Here's the pseudocode I would consider based on the documentation:

# new hash
output_hash = {}
# populated with the given objects
input_hash_or_array.each do |k, v|
  output_hash[k] = v
end
output_hash

There are certainly valid arguments for both sides, so we'll have to see what @matz (Yukihiro Matsumoto) decides.

Updated by Eregon (Benoit Daloze) over 1 year ago

The way I see it, Hash[*args] converts args to a Hash, and args is only considered as key-value pairs. So if args = [some_Hash], it's just one of the representation of key-value pairs but the behavior shouldn't be special with a Hash argument otherwise.
IOW I agree with Jeremy here. To copy a Hash there is .dup/.clone.

Actions #10

Updated by jeremyevans (Jeremy Evans) 12 months ago

  • Status changed from Open to Closed

Applied in changeset git|d3197def882b47e7c57cdddfe8d62f62fef9d3f7.


Do not copy compare_by_identity flag for non-empty hashes in Hash.[]

It wasn't copied for empty hashes, and Hash.[] doesn't copy the
default value, so copying the compare_by_identity flag does not
make sense.

Partially Fixes [Bug #19113]

Actions

Also available in: Atom PDF

Like1
Like1Like0Like0Like0Like0Like0Like0Like0Like0Like0