Bug #12970
closed== Equality of recursive sets fails
Description
Comparing recursive arrays and hashes with equal? contents (save for the recursive element) using == succeeds.
However, using == to compare two recursive sets with equal? contents fails. I expect that to succeed.
See the attached script. This is my output for 2.2.5 and 2.3.3:
[1, 2, 3] == [1, 2, 3]? -> true
[1, 2, 3, [...]] == [1, 2, 3, [...]]? -> true
{:a=>1, :b=>2} == {:a=>1, :b=>2}? -> true
{:a=>1, :b=>2, :c=>{...}} == {:a=>1, :b=>2, :c=>{...}}? -> true
#<Set:0x00000001f90fc8> == #<Set:0x00000001f90500>? -> true
#<Set:0x00000001f92968> == #<Set:0x00000001f91478>? -> false
Files
Updated by kdeberk (Kevin de Berk) about 8 years ago
- Description updated (diff)
Updated by marcandre (Marc-Andre Lafortune) about 8 years ago
Took me a while to understand what was going on.
The issue is that the Set would need to be rehashed. Indeed, when you add it to itself, the element you are adding is modified too.
This is, in a way, similar to:
a = []
s = Set.new(a)
a << 42
s == Set.new([42]) # => false
If you add the following (horrible) two lines of code before calling test
, the comparison works:
recursive_set_1.instance_variable_get(:@hash).rehash
recursive_set_2.instance_variable_get(:@hash).rehash
Sadly, there is not (yet) a Set#rehash
method. Original request for it by yours truly: check #6589...
Updated by marcandre (Marc-Andre Lafortune) about 8 years ago
- Related to Feature #6589: Set#rehash added
Updated by Esse (Piotr Szmielew) almost 8 years ago
- File fix_recursive_sets.patch fix_recursive_sets.patch added
I've created patch for this issue.
Basically we need to rehash hash beneath set if (and only if) added object is self (therefore creating recursive set).
This patch add such rehash behaviour. This will also fix subset and superset behaviour to be coherent with set theory approach (recursive set is both self subset and superset).
In attachment you will find this patch for current ruby trunk.
Updated by marcandre (Marc-Andre Lafortune) almost 8 years ago
That's not going to cut it... There are many other counter examples....
s = Set.new
t = Set.new
s << [s]
t << [t]
s == t # => false
or similarly:
s = Set.new
t = Set.new
a = Set.new([s])
b = Set.new([t])
s << a
t << b
s == t # => false
Updated by Esse (Piotr Szmielew) almost 8 years ago
In this particular example even manually rehashing internal hash doesn't quite work...
s = Set.new
t = Set.new
s << [s]
t << [t]
s == t # => false
s.instance_variable_get(:@hash).rehash
t.instance_variable_get(:@hash).rehash
s == t # => false
So this isn't issue with rehashing, therefore my patch definitely wouldn't work here.
There is another solution - when comparing sets, use hash method calculate hashes and compare them. Code would look like this:
def ==(other)
if self.equal?(other)
true
elsif other.instance_of?(self.class)
@hash.keys.hash == other.instance_variable_get(:@hash).keys.hash
elsif other.is_a?(Set) && self.size == other.size
other.all? { |o| @hash.keys.map(&:hash).include?(o.hash) }
else
false
end
end
With this approach, examples you provided works correctly.
Let me know if this is acceptable solution and should I provide patch.
Marc-Andre Lafortune wrote:
That's not going to cut it... There are many other counter examples....
s = Set.new t = Set.new s << [s] t << [t] s == t # => false
or similarly:
s = Set.new t = Set.new a = Set.new([s]) b = Set.new([t]) s << a t << b s == t # => false
Updated by marcandre (Marc-Andre Lafortune) almost 8 years ago
Piotr Szmielew wrote:
In this particular example even manually rehashing internal hash doesn't quite work...
Right, looks like hash lookup (internally st_lookup
) doesn't work with recursive values :-(
There is another solution - when comparing sets, use hash method calculate hashes and compare them. Code would look like this:
Comparing hashes is not sufficient for equality. You can have collisions, in particular recursive structures have the same hash... Moreover that code would be too slow to be acceptable.
I'm afraid there is not enough use for recursive Set
s to have figure out a solution...
Updated by Esse (Piotr Szmielew) almost 8 years ago
Also documentation at top of file says:
# * Set assumes that the identity of each element does not change
# while it is stored. Modifying an element of a set will render the
# set to an unreliable state.
creating recursive sets definitely change element while it is stored. So I believe this is not really a bug, simply unexpected behaviour that actually is quite consistent with documentation.
Marc-Andre Lafortune wrote:
Piotr Szmielew wrote:
In this particular example even manually rehashing internal hash doesn't quite work...
Right, looks like hash lookup (internally
st_lookup
) doesn't work with recursive values :-(There is another solution - when comparing sets, use hash method calculate hashes and compare them. Code would look like this:
Comparing hashes is not sufficient for equality. You can have collisions, in particular recursive structures have the same hash... Moreover that code would be too slow to be acceptable.
I'm afraid there is not enough use for recursive
Set
s to have figure out a solution...
Updated by knu (Akinori MUSHA) about 7 years ago
- Status changed from Open to Closed
This is not a bug but a feature. Set#reset has been added so you can use it if need be.