Project

General

Profile

Bug #12970

== Equality of recursive sets fails

Added by kdeberk (Kevin de Berk) almost 3 years ago. Updated almost 2 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
2.2.5, 2.3.3
[ruby-core:78256]

Description

Comparing recursive arrays and hashes with equal? contents (save for the recursive element) using == succeeds.
However, using == to compare two recursive sets with equal? contents fails. I expect that to succeed.

See the attached script. This is my output for 2.2.5 and 2.3.3:

[1, 2, 3] == [1, 2, 3]? -> true
[1, 2, 3, [...]] == [1, 2, 3, [...]]? -> true
{:a=>1, :b=>2} == {:a=>1, :b=>2}? -> true
{:a=>1, :b=>2, :c=>{...}} == {:a=>1, :b=>2, :c=>{...}}? -> true
#<Set:0x00000001f90fc8> == #<Set:0x00000001f90500>? -> true
#<Set:0x00000001f92968> == #<Set:0x00000001f91478>? -> false

Files

recursive_set_comparison.rb (871 Bytes) recursive_set_comparison.rb kdeberk (Kevin de Berk), 11/22/2016 04:31 PM
fix_recursive_sets.patch (2.39 KB) fix_recursive_sets.patch Esse (Piotr Szmielew), 12/23/2016 09:54 PM

Related issues

Related to Ruby master - Feature #6589: Set#rehashClosedActions

History

Updated by kdeberk (Kevin de Berk) almost 3 years ago

  • Description updated (diff)

Updated by marcandre (Marc-Andre Lafortune) almost 3 years ago

Took me a while to understand what was going on.

The issue is that the Set would need to be rehashed. Indeed, when you add it to itself, the element you are adding is modified too.

This is, in a way, similar to:

a = []
s = Set.new(a)
a << 42
s == Set.new([42]) # => false

If you add the following (horrible) two lines of code before calling test, the comparison works:

  recursive_set_1.instance_variable_get(:@hash).rehash
  recursive_set_2.instance_variable_get(:@hash).rehash

Sadly, there is not (yet) a Set#rehash method. Original request for it by yours truly: check #6589...

#3

Updated by marcandre (Marc-Andre Lafortune) almost 3 years ago

Updated by Esse (Piotr Szmielew) over 2 years ago

I've created patch for this issue.

Basically we need to rehash hash beneath set if (and only if) added object is self (therefore creating recursive set).

This patch add such rehash behaviour. This will also fix subset and superset behaviour to be coherent with set theory approach (recursive set is both self subset and superset).

In attachment you will find this patch for current ruby trunk.

Updated by marcandre (Marc-Andre Lafortune) over 2 years ago

That's not going to cut it... There are many other counter examples....

s = Set.new
t = Set.new
s << [s]
t << [t]
s == t # => false

or similarly:

s = Set.new
t = Set.new
a = Set.new([s])
b = Set.new([t])
s << a
t << b
s == t # => false

Updated by Esse (Piotr Szmielew) over 2 years ago

In this particular example even manually rehashing internal hash doesn't quite work...

s = Set.new
t = Set.new
s << [s]
t << [t]
s == t # => false
s.instance_variable_get(:@hash).rehash
t.instance_variable_get(:@hash).rehash
s == t # => false

So this isn't issue with rehashing, therefore my patch definitely wouldn't work here.

There is another solution - when comparing sets, use hash method calculate hashes and compare them. Code would look like this:

  def ==(other)
    if self.equal?(other)
      true
    elsif other.instance_of?(self.class)
      @hash.keys.hash == other.instance_variable_get(:@hash).keys.hash
    elsif other.is_a?(Set) && self.size == other.size
      other.all? { |o| @hash.keys.map(&:hash).include?(o.hash) }
    else
      false
    end
  end

With this approach, examples you provided works correctly.

Let me know if this is acceptable solution and should I provide patch.

Marc-Andre Lafortune wrote:

That's not going to cut it... There are many other counter examples....

s = Set.new
t = Set.new
s << [s]
t << [t]
s == t # => false

or similarly:

s = Set.new
t = Set.new
a = Set.new([s])
b = Set.new([t])
s << a
t << b
s == t # => false

Updated by marcandre (Marc-Andre Lafortune) over 2 years ago

Piotr Szmielew wrote:

In this particular example even manually rehashing internal hash doesn't quite work...

Right, looks like hash lookup (internally st_lookup) doesn't work with recursive values :-(

There is another solution - when comparing sets, use hash method calculate hashes and compare them. Code would look like this:

Comparing hashes is not sufficient for equality. You can have collisions, in particular recursive structures have the same hash... Moreover that code would be too slow to be acceptable.

I'm afraid there is not enough use for recursive Sets to have figure out a solution...

Updated by Esse (Piotr Szmielew) over 2 years ago

Also documentation at top of file says:

# * Set assumes that the identity of each element does not change
#   while it is stored.  Modifying an element of a set will render the
#   set to an unreliable state.

creating recursive sets definitely change element while it is stored. So I believe this is not really a bug, simply unexpected behaviour that actually is quite consistent with documentation.

Marc-Andre Lafortune wrote:

Piotr Szmielew wrote:

In this particular example even manually rehashing internal hash doesn't quite work...

Right, looks like hash lookup (internally st_lookup) doesn't work with recursive values :-(

There is another solution - when comparing sets, use hash method calculate hashes and compare them. Code would look like this:

Comparing hashes is not sufficient for equality. You can have collisions, in particular recursive structures have the same hash... Moreover that code would be too slow to be acceptable.

I'm afraid there is not enough use for recursive Sets to have figure out a solution...

Updated by knu (Akinori MUSHA) almost 2 years ago

  • Status changed from Open to Closed

This is not a bug but a feature. Set#reset has been added so you can use it if need be.

Also available in: Atom PDF