Bug #21375
openSet[] does not call #initialize
Description
I have a subclass of Set that overrides #initialize. Following #21216, .new does call #initialize but .[] does not.
class MySet < Set
def initialize(enum = nil)
compare_by_identity
super
end
end
MySet.new.compare_by_identity?
# => true
MySet[].compare_by_identity?
# => false
Updated by nobu (Nobuyoshi Nakada) 2 days ago
I'm not sure if it is intentional or not.
If unintentional, this patch may fix it.
diff --git a/set.c b/set.c
index 8676c62cd35..c41781c446f 100644
--- a/set.c
+++ b/set.c
@@ -409,13 +409,7 @@ static VALUE
set_s_create(int argc, VALUE *argv, VALUE klass)
{
VALUE set = set_alloc_with_size(klass, argc);
- set_table *table = RSET_TABLE(set);
- int i;
-
- for (i=0; i < argc; i++) {
- set_table_insert_wb(table, set, argv[i], NULL);
- }
-
+ rb_obj_call_init(set, argc, argv);
return set;
}
Updated by jeremyevans0 (Jeremy Evans) 2 days ago
nobu (Nobuyoshi Nakada) wrote in #note-1:
I'm not sure if it is intentional or not.
It was intentional when I developed core Set. Set.[]
takes different arguments than Set#initialize
.
Array.[]
does not call Array#initialize
, and Hash.[]
does not call Hash#initialize
, so there is precedent for the core collection classes to operate this way.
Set.[]
was not documented as calling Set#initialize
previously, and there were no tests or specs for it. Relying on Set.[]
calling Set#initialize
was relying on undefined behavior.
If unintentional, this patch may fix it.
diff --git a/set.c b/set.c index 8676c62cd35..c41781c446f 100644 --- a/set.c +++ b/set.c @@ -409,13 +409,7 @@ static VALUE set_s_create(int argc, VALUE *argv, VALUE klass) { VALUE set = set_alloc_with_size(klass, argc); - set_table *table = RSET_TABLE(set); - int i; - - for (i=0; i < argc; i++) { - set_table_insert_wb(table, set, argv[i], NULL); - } - + rb_obj_call_init(set, argc, argv); return set; }
I'm not sure whether this is correct. #initialize
arguments are enumerables of elements, .[]
arguments are elements. I think if we wanted to call #initialize
, we would have to allocate an array for the elements, and pass that, which would reduce performance.
Updated by Ethan (Ethan -) 1 day ago
It seems like a regression to me. I mean, it broke my code - maybe I'm alone, I can't say whether other people override #initialize and expect Set[] to call it, but it seems like a reasonable assumption to make, particularly since it always has done that. I wasn't aware that Array[]/Hash[] don't call initialize and I find that counterintuitive, and would advocate against changing other classes/constructors to do that.
Set.[]
was not documented as callingSet#initialize
previously
My feeling is that the base expectation is that #initialize is called on new objects. I know constructors can allocate and do things without #initialize, but I think that is quite rare. Maybe I'm wrong, I just learned Array[] and Hash[] do that, but I've almost never seen it.
I'll also note that the effects of not calling #initialize are easy to miss, since the set will still be a perfectly functional set, but will have lost whatever checks or changes are intended on initialization. At least in my case this was not straightforward to trace back to this change, it took me a while in the debugger to figure out that some of my sets were unexpectedly no longer compare_by_identity as my #initialize sets.
Having figured that out it is of course trivial for me to override MySet.[], but I'd rather Set behave like it used to, especially if that would save other people having to debug subtle bugs and fix.