Bug #14908

Enumerator::Lazy creates unnecessary Array objects.

Added by chopraanmol1 (Anmol Chopra) 9 days ago. Updated 7 days ago.

Status: Open
Priority: Normal
Assignee: -
Target version: -
ruby -v: ruby 2.6.0dev (2018-07-11 trunk 63949) [x86_64-linux]
[ruby-core:87907]

Description

Benchmark result on trunk:

                 user     system      total        real
Lazy:        0.120000   0.000000   0.120000 (  0.119958)
Normal:      0.056000   0.004000   0.060000 (  0.062848)
             2.142857   0.000000        NaN (  1.908698)
++++++++++++++++++++++++++++++++++++++++++++++++++
Lazy:
Total allocated: 122240 bytes (3033 objects)
Total retained:  0 bytes (0 objects)

allocated memory by class
--------------------------------------------------
    120480  Array
       880  Proc
       384  Enumerator::Lazy
       264  Object
       168  Enumerator::Generator
        64  Enumerator::Yielder

allocated objects by class
--------------------------------------------------
      3012  Array
        11  Proc
         3  Enumerator::Generator
         3  Enumerator::Lazy
         3  Object
         1  Enumerator::Yielder
++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++
Normal:
Total allocated: 72120 bytes (3 objects)
Total retained:  0 bytes (0 objects)

allocated memory by class
--------------------------------------------------
     72120  Array

allocated objects by class
--------------------------------------------------
         3  Array
++++++++++++++++++++++++++++++++++++++++++++++++++

As you may observe, an extra array is created for every final element. Enumerator::Yielder#yield has an arity of -1, which wraps every element in an array. The same goes for Enumerator::Yielder#<<. I'm proposing to change the arity of Enumerator::Yielder#<< from -1 to 1. This will also make the << method definition consistent with other classes (Array, String, etc.).
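To make the arity discussion concrete, here is a minimal Ruby sketch (not part of the original report) that inspects the methods involved. The arity printed for Enumerator::Yielder#<< depends on whether the running Ruby includes the proposed change, so the sketch only prints it and makes no claim about its value.

```ruby
# Arity of the two Yielder methods discussed above. The values are only
# printed, because they depend on the Ruby version in use.
yield_arity = Enumerator::Yielder.instance_method(:yield).arity
push_arity  = Enumerator::Yielder.instance_method(:<<).arity
puts "Yielder#yield arity: #{yield_arity}"
puts "Yielder#<< arity:    #{push_arity}"

# For comparison: << on Array and String takes exactly one argument.
array_push_arity  = Array.instance_method(:<<).arity
string_push_arity = String.instance_method(:<<).arity
puts "Array#<< arity:      #{array_push_arity}"
puts "String#<< arity:     #{string_push_arity}"

# Normal operator usage passes a single value and returns the yielder,
# so calls can be chained.
chained = Enumerator.new { |y| y << 1 << 2 << 3 }.to_a
p chained

# Yielder#yield still accepts several values at once.
multi = Enumerator.new { |y| y.yield(1, 2) }.to_a
p multi
```

With the operator syntax only one value can ever be passed to <<, which is why an arity of 1 matches how the method is normally used.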

I've applied the following set of changes to trunk (pull request: https://github.com/ruby/ruby/pull/1912):

diff --git a/enumerator.c b/enumerator.c
index 050a9ce58f..c21d6f36f1 100644
--- a/enumerator.c
+++ b/enumerator.c
@@ -103,7 +103,7 @@
  */
 VALUE rb_cEnumerator;
 static VALUE rb_cLazy;
-static ID id_rewind, id_new, id_yield, id_to_enum;
+static ID id_rewind, id_new, id_yield, id_yield_push, id_to_enum;
 static ID id_next, id_result, id_lazy, id_receiver, id_arguments, id_memo, id_method, id_force;
 static VALUE sym_each, sym_cycle;

@@ -1265,9 +1265,14 @@ yielder_yield(VALUE obj, VALUE args)

 /* :nodoc: */
 static VALUE
-yielder_yield_push(VALUE obj, VALUE args)
+yielder_yield_push(VALUE obj, VALUE arg)
 {
-    yielder_yield(obj, args);
+    struct yielder *ptr = yielder_ptr(obj);
+
+    rb_proc_call_with_block(ptr->proc, 1, &arg, Qnil);
+
     return obj;
 }

@@ -1510,7 +1515,7 @@ lazy_init_yielder(VALUE val, VALUE m, int argc, VALUE *argv)
     }

     if (cont) {
- rb_funcall2(yielder, id_yield, 1, &(result->memo_value));
+ rb_funcall2(yielder, id_yield_push, 1, &(result->memo_value));
     }
     if (LAZY_MEMO_BREAK_P(result)) {
  rb_iter_break();
@@ -2448,7 +2453,7 @@ InitVM_Enumerator(void)
     rb_define_alloc_func(rb_cYielder, yielder_allocate);
     rb_define_method(rb_cYielder, "initialize", yielder_initialize, 0);
     rb_define_method(rb_cYielder, "yield", yielder_yield, -2);
-    rb_define_method(rb_cYielder, "<<", yielder_yield_push, -2);
+    rb_define_method(rb_cYielder, "<<", yielder_yield_push, 1);

     rb_provide("enumerator.so"); /* for backward compatibility */
 }
@@ -2459,6 +2464,7 @@ Init_Enumerator(void)
 {
     id_rewind = rb_intern("rewind");
     id_yield = rb_intern("yield");
+    id_yield_push = rb_intern("<<");
     id_new = rb_intern("new");
     id_next = rb_intern("next");
     id_result = rb_intern("result");

which results in the following:

                 user     system      total        real
Lazy:        0.108000   0.000000   0.108000 (  0.108484)
Normal:      0.052000   0.008000   0.060000 (  0.062528)
             2.076923   0.000000        NaN (  1.734961)
++++++++++++++++++++++++++++++++++++++++++++++++++
Lazy:
Total allocated: 2240 bytes (33 objects)
Total retained:  0 bytes (0 objects)

allocated memory by class
--------------------------------------------------
       880  Proc
       480  Array
       384  Enumerator::Lazy
       264  Object
       168  Enumerator::Generator
        64  Enumerator::Yielder

allocated objects by class
--------------------------------------------------
        12  Array
        11  Proc
         3  Enumerator::Generator
         3  Enumerator::Lazy
         3  Object
         1  Enumerator::Yielder
++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++
Normal:
Total allocated: 72120 bytes (3 objects)
Total retained:  0 bytes (0 objects)

allocated memory by class
--------------------------------------------------
     72120  Array

allocated objects by class
--------------------------------------------------
         3  Array
++++++++++++++++++++++++++++++++++++++++++++++++++

This change reduces memory utilization and also improves performance for the lazy enumerator by a tiny fraction.
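The attached lazy_test.rb is not reproduced here. As a rough way to observe the same kind of difference using only core Ruby, the following sketch compares object allocations for a lazy and an eager chain. The helper name `allocations`, the input size, and the map/select chain are assumptions made for illustration, not the contents of the attached benchmark, and the counts will vary by Ruby version.

```ruby
# Hypothetical helper: returns the block's result together with the number
# of objects allocated while the block ran.
def allocations
  before = GC.stat(:total_allocated_objects)
  result = yield
  after = GC.stat(:total_allocated_objects)
  [result, after - before]
end

data = (1..1000).to_a

# Lazy chain: each surviving element passes through the yielder.
lazy_result, lazy_allocs = allocations do
  data.lazy.map { |x| x * 2 }.select { |x| x % 3 == 0 }.to_a
end

# Eager chain: one intermediate array per step.
eager_result, eager_allocs = allocations do
  data.map { |x| x * 2 }.select { |x| x % 3 == 0 }
end

puts "Lazy:   #{lazy_allocs} objects allocated"
puts "Normal: #{eager_allocs} objects allocated"
puts "Same result: #{lazy_result == eager_result}"
```

Both chains produce the same array; only the number of intermediate objects differs.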

lazy_test.rb (1.26 KB) - Benchmark File, chopraanmol1 (Anmol Chopra), 07/11/2018 01:09 PM

History

#1 Updated by chopraanmol1 (Anmol Chopra) 9 days ago

  • Description updated (diff)

#2 Updated by chopraanmol1 (Anmol Chopra) 9 days ago

  • Description updated (diff)

#3 [ruby-core:87914] Updated by shyouhei (Shyouhei Urabe) 9 days ago

I understand the problem. The proposed fix, however, involves a spec change. We need to discuss its effects before applying this patch.

#4 [ruby-core:87918] Updated by chopraanmol1 (Anmol Chopra) 9 days ago

shyouhei (Shyouhei Urabe) wrote:

I understand the problem. The proposed fix, however, involves a spec change. We need to discuss its effects before applying this patch.

If this is a big breaking change, then the alternative would be to create a different method for Enumerator::Lazy's internal use. I'm also up for updating the patch to reflect that. But I think for a future release it would make more sense for Enumerator::Yielder#<< to have an arity of 1, given how the << syntax is normally used.

But for backporting purposes, I'm more inclined to create a new method.

#5 [ruby-core:87924] Updated by Eregon (Benoit Daloze) 8 days ago

Changing Enumerator::Yielder#<< to have arity 1 seems fine to me, as I guess nobody calls << on an Enumerator::Yielder with more than 1 argument, right?

#6 [ruby-core:87933] Updated by chopraanmol1 (Anmol Chopra) 8 days ago

Eregon (Benoit Daloze) wrote:

Changing Enumerator::Yielder#<< to have arity 1 seems fine to me, as I guess nobody calls << on an Enumerator::Yielder with more than 1 argument, right?

Yes, that will be the general case. The exceptions are explicit method calls:

.send(:<<,...)
.<<(...)
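A small sketch of what these exceptional call styles look like in practice. The method name `multi_arg_push` is hypothetical, and the outcome depends on the Ruby in use: with an arity of -1 both values are yielded together, while with an arity of 1 the call raises ArgumentError. The sketch handles both cases rather than assuming one.

```ruby
# Hypothetical helper: call Yielder#<< with two arguments via send and
# report what happens on the running Ruby.
def multi_arg_push
  Enumerator.new { |y| y.send(:<<, 1, 2) }.to_a
rescue ArgumentError => e
  "ArgumentError: #{e.message}"
end

# The explicit-call form with a single argument behaves the same as the
# operator form regardless of arity.
single = Enumerator.new { |y| y.<<(1).<<(2) }.to_a

result = multi_arg_push
p single
p result
```

Only code that passes more than one argument through `send` or the explicit `.<<(...)` form would notice the change.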

#7 [ruby-core:87936] Updated by marcandre (Marc-Andre Lafortune) 8 days ago

Indeed, as long as Yielder#yield is kept with arity -1 (as in this patch), I don't think that would be an "incompatibility" we should worry about.

#8 Updated by chopraanmol1 (Anmol Chopra) 7 days ago

  • Description updated (diff)
