Bug #15027
closedWhen Struct#each method is overriden Struct#select and Struct#to_a use wrong collections
Added by bruno (Bruno Sutic) about 7 years ago. Updated about 4 years ago.
Description
Bug¶
Here's the code snippet that should reproduce the problem:
class Foo < Struct.new(:bar)
  def each(&block)
    [:baz, :qux].each(&block)
  end
end
foo = Foo.new(:foo)
foo.map(&:itself) # => [:baz, :qux]  # OK
foo.to_a # => [:foo]  # NOT OK, expected [:baz, :qux]
foo.select(&:itself) # => [:foo]  # NOT OK, expected [:baz, :qux]
As you can see, even tho we defined another collection for use by overriding #each, the to_a and select still use Struct's original collection.
The problem seem to be with Struct#to_a and Struct#select methods from struct.c file that are defined unnecessarily.
Proposed solution¶
The attached solution simply deletes Struct#select and Struct#to_a. A couple tests are added to show everything still works as before.
Please let me know if I can provide any more info and I'll be ready to do so.
Files
| struct_enumerable_fix.patch (2.8 KB) struct_enumerable_fix.patch | bruno (Bruno Sutic), 08/25/2018 05:29 PM | ||
| struct_enumerable_fix2.patch (1.51 KB) struct_enumerable_fix2.patch | Updated patch | bruno (Bruno Sutic), 09/01/2018 04:04 PM | 
        
           Updated by matz (Yukihiro Matsumoto) about 7 years ago
          
          
        
        
          
            Actions
          
          #1
            [ruby-core:88647]
          Updated by matz (Yukihiro Matsumoto) about 7 years ago
          
          
        
        
          
            Actions
          
          #1
            [ruby-core:88647]
        
      
      - Status changed from Open to Feedback
The proposed behavior is more consistent but slower. I am not sure it's a good idea to hinder performance by supporting consistency in the rare case. Any opinion?
Matz.
        
           Updated by jeremyevans0 (Jeremy Evans) about 7 years ago
          
          
        
        
          
            Actions
          
          #2
            [ruby-core:88648]
          Updated by jeremyevans0 (Jeremy Evans) about 7 years ago
          
          
        
        
          
            Actions
          
          #2
            [ruby-core:88648]
        
      
      I'm against changing the current behavior.  If you can override each, you can override other methods.  If we wanted to be consistent about making this change, we would probably need to remove most the following additional core methods so that the Enumerable implementation would be used:
Array: any?, collect, count, cycle, drop, drop_while, find_index, first, include?, map, max, min, reject, reverse_each, select, sort, sum, take, take_while, to_a, to_h, uniq, zip
Hash: any?, include?, member?, to_a, to_h
Range: first, include?, max, member?, min
Struct: to_h
Above results generated with the following code (which includes some additional classes and methods):
classes = []
meths = Enumerable.instance_methods(false) - [:each]
ObjectSpace.each_object(Class){|c| classes << c if c < Enumerable && c.name}
classes.sort_by(&:name).each do |c|
  same = c.instance_methods(false) & meths
  next if same.empty?
  puts "#{c}: #{same.sort.join(', ')}"
end
I'm guessing the standard library would need additional changes (e.g. set).
If anyone really wants the default Enumerable behavior for Struct:
Struct.remove_method(:to_a, :to_h, :select)
        
           Updated by shevegen (Robert A. Heiler) about 7 years ago
          
          
        
        
          
            Actions
          
          #3
            [ruby-core:88651]
          Updated by shevegen (Robert A. Heiler) about 7 years ago
          
          
        
        
          
            Actions
          
          #3
            [ruby-core:88651]
        
      
      Since matz asked for feedback, just a comment - I have no particular pro or con opinion
per se, mostly because I very rarely use the Struct/OpenStruct family; I usually just end
up writing a "real" class instead and adapt it to what is necessary. The only suggestion
I would like to make in regards to Struct is to mention this in the documentation of Struct (
https://ruby-doc.org/core-2.5.1/Struct.html ) so that it may not surprise ruby users who
use Struct; could also mention Jeremy's example of using the default Enumerable behaviour
for Struct.
        
           Updated by Hanmac (Hans Mackowiak) about 7 years ago
          
          
        
        
          
            Actions
          
          #4
            [ruby-core:88654]
          Updated by Hanmac (Hans Mackowiak) about 7 years ago
          
          
        
        
          
            Actions
          
          #4
            [ruby-core:88654]
        
      
      as a middle way, can't we just do the "is overwritten by user" check?
i think i have seen it on other classes like Array or Hash
so it checks if the each method is overwritten, in that case, call the Enumerable method,
if not, call the Struct own one.
        
           Updated by nobu (Nobuyoshi Nakada) about 7 years ago
          
          
        
        
          
            Actions
          
          #5
            [ruby-core:88733]
          Updated by nobu (Nobuyoshi Nakada) about 7 years ago
          
          
        
        
          
            Actions
          
          #5
            [ruby-core:88733]
        
      
      diff --git a/struct.c b/struct.c
index 7de46980aa..e4c875b5be 100644
--- a/struct.c
+++ b/struct.c
@@ -860,6 +860,9 @@ rb_struct_inspect(VALUE s)
 static VALUE
 rb_struct_to_a(VALUE s)
 {
+    if (!rb_method_basic_definition_p(CLASS_OF(s), idEach)) {
+        return rb_call_super(0, 0);
+    }
     return rb_ary_new4(RSTRUCT_LEN(s), RSTRUCT_CONST_PTR(s));
 }
 
@@ -1077,6 +1080,9 @@ rb_struct_select(int argc, VALUE *argv, VALUE s)
 
     rb_check_arity(argc, 0, 0);
     RETURN_SIZED_ENUMERATOR(s, 0, 0, struct_enum_size);
+    if (!rb_method_basic_definition_p(CLASS_OF(s), idEach)) {
+        return rb_call_super(argc, argv);
+    }
     result = rb_ary_new();
     for (i = 0; i < RSTRUCT_LEN(s); i++) {
 	if (RTEST(rb_yield(RSTRUCT_GET(s, i)))) {
        
           Updated by bruno (Bruno Sutic) about 7 years ago
          
          
        
        
          
            Actions
          
          #6
            [ruby-core:88801]
          Updated by bruno (Bruno Sutic) about 7 years ago
          
          
        
        
          
            Actions
          
          #6
            [ruby-core:88801]
        
      
      Thank you for the code @nobu (Nobuyoshi Nakada). I have applied Nobu's suggestion, please find updated patch in the attach.
If you can override each, you can override other methods.
This is true, but doing this is very un-ruby-ish. Also, I don't think an average ruby developer will be patient enough to investigate why is this happening. A more realistic scenario is that a developer will just implement a custom #select or #to_a method.
        
           Updated by Hanmac (Hans Mackowiak) about 7 years ago
          
          
        
        
          
            Actions
          
          #7
            [ruby-core:88818]
          Updated by Hanmac (Hans Mackowiak) about 7 years ago
          
          
        
        
          
            Actions
          
          #7
            [ruby-core:88818]
        
      
      @matz (Yukihiro Matsumoto): is the patch from Nobu good enough?
        
           Updated by bruno (Bruno Sutic) about 7 years ago
          
          
        
        
          
            Actions
          
          #8
            [ruby-core:89173]
          Updated by bruno (Bruno Sutic) about 7 years ago
          
          
        
        
          
            Actions
          
          #8
            [ruby-core:89173]
        
      
      Hi,
let me know if you have any further feedback on this one?
        
           Updated by mame (Yusuke Endoh) almost 6 years ago
          
          
        
        
          
            Actions
          
          #9
            [ruby-core:96154]
          Updated by mame (Yusuke Endoh) almost 6 years ago
          
          
        
        
          
            Actions
          
          #9
            [ruby-core:96154]
        
      
      - Status changed from Feedback to Assigned
- Assignee set to matz (Yukihiro Matsumoto)
Matz, do you accept the suggested patch?
To be honest, I'm negative against this patch.  It brings false consistency.  The current behavior is very easy to explain; Struct#select is just overridden.  A redefinition check is a magic.
        
           Updated by Dan0042 (Daniel DeLorme) almost 6 years ago
          
          
        
        
          
            Actions
          
          #10
            [ruby-core:96164]
          Updated by Dan0042 (Daniel DeLorme) almost 6 years ago
          
          
        
        
          
            Actions
          
          #10
            [ruby-core:96164]
        
      
      In the case that a method is overridden only for optimization's sake, the override check is simply the equivalent of deoptimizing when a required constraint doesn't hold. This is pretty common in a VM right?
        
           Updated by mame (Yusuke Endoh) almost 6 years ago
          
          
        
        
          
            Actions
          
          #11
            [ruby-core:96178]
          Updated by mame (Yusuke Endoh) almost 6 years ago
          
          
        
        
          
            Actions
          
          #11
            [ruby-core:96178]
        
      
      In general, optimization should be invisible from users (except performance). This kind of redefinition checks in VM are carefully designed to be invisible.
Apparently, Struct#select is defined for the sake of performance improvement.  But IMO, it is not an optimization because it is still visible from users, even after this hack is introduced.
class Foo < Struct.new(:bar)
end
class Bar
  include Enumerable
end
Bar.instance_method(:select).bind_call([:ok]) {|x| p x } #=> :ok
Foo.instance_method(:select).bind_call([:ok]) {|x| p x }
  #=> bind argument must be an instance of Struct (TypeError)
If there is no redefinition check, we can easily explain the current situation; it is a normal override.
That being said, I agree that the current behavior looks inconsistent. I don't like the patch, but I'm not against it.
        
           Updated by matz (Yukihiro Matsumoto) about 4 years ago
          
          
        
        
          
            Actions
          
          #12
            [ruby-core:105284]
          Updated by matz (Yukihiro Matsumoto) about 4 years ago
          
          
        
        
          
            Actions
          
          #12
            [ruby-core:105284]
        
      
      - Status changed from Assigned to Rejected
The methods may be overridden in the subclass. The assumption in this report seems to be too much.
Rejected.
Matz.