Bug #14908
closedEnumerator::Lazy creates unnecessary Array objects.
Description
Benchmark result on trunk:
user system total real
Lazy: 0.120000 0.000000 0.120000 ( 0.119958)
Normal: 0.056000 0.004000 0.060000 ( 0.062848)
2.142857 0.000000 NaN ( 1.908698)
++++++++++++++++++++++++++++++++++++++++++++++++++
Lazy:
Total allocated: 122240 bytes (3033 objects)
Total retained: 0 bytes (0 objects)
allocated memory by class
--------------------------------------------------
120480 Array
880 Proc
384 Enumerator::Lazy
264 Object
168 Enumerator::Generator
64 Enumerator::Yielder
allocated objects by class
--------------------------------------------------
3012 Array
11 Proc
3 Enumerator::Generator
3 Enumerator::Lazy
3 Object
1 Enumerator::Yielder
++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++
Normal:
Total allocated: 72120 bytes (3 objects)
Total retained: 0 bytes (0 objects)
allocated memory by class
--------------------------------------------------
72120 Array
allocated objects by class
--------------------------------------------------
3 Array
++++++++++++++++++++++++++++++++++++++++++++++++++
As you may observe an extra array is created for every final element. Enumerator::Yielder#yield method has arity of -1 which wrap every elements in array. The same goes for Enumerator::Yielder#<< method, I'm proposing to change arity for Enumerator::Yielder#<< to 1 from -1 and use it internally for the lazy enum. It will also make << method definition consistent with other classes(Array, String & etc).
I've applied the following set of changes:
https://github.com/ruby/ruby/pull/1912
which result in the following:
user system total real
Lazy: 0.108000 0.000000 0.108000 ( 0.108484)
Normal: 0.052000 0.008000 0.060000 ( 0.062528)
2.076923 0.000000 NaN ( 1.734961)
++++++++++++++++++++++++++++++++++++++++++++++++++
Lazy:
Total allocated: 2240 bytes (33 objects)
Total retained: 0 bytes (0 objects)
allocated memory by class
--------------------------------------------------
880 Proc
480 Array
384 Enumerator::Lazy
264 Object
168 Enumerator::Generator
64 Enumerator::Yielder
allocated objects by class
--------------------------------------------------
12 Array
11 Proc
3 Enumerator::Generator
3 Enumerator::Lazy
3 Object
1 Enumerator::Yielder
++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++
Normal:
Total allocated: 72120 bytes (3 objects)
Total retained: 0 bytes (0 objects)
allocated memory by class
--------------------------------------------------
72120 Array
allocated objects by class
--------------------------------------------------
3 Array
++++++++++++++++++++++++++++++++++++++++++++++++++
This changes reduces the memory utilization and also by a tiny fraction improves performance for the lazy enumerator
Files
Updated by chopraanmol1 (Anmol Chopra) over 6 years ago
- Description updated (diff)
Updated by chopraanmol1 (Anmol Chopra) over 6 years ago
- Description updated (diff)
Updated by shyouhei (Shyouhei Urabe) over 6 years ago
Understand the problem. The proposed fix however involves spec change. We need to discuss effects of it before applying this patch.
Updated by chopraanmol1 (Anmol Chopra) over 6 years ago
shyouhei (Shyouhei Urabe) wrote:
Understand the problem. The proposed fix however involves spec change. We need to discuss effects of it before applying this patch.
If this is a big breaking change than alternate will be to create a different method for Enumerator::Lazy's internal use. I'm also up for updating the patch to reflect that. But I think for future release it will make more sense to have Enumerator::Yielder#<< to have an arity of 1 if you consider syntax use case.
But for the backporting purpose, I'm more inclined to create a new method.
Updated by Eregon (Benoit Daloze) over 6 years ago
Changing Enumerator::Yielder#<< to have arity 1 seems fine to me, as I guess nobody calls << on an Enumerator::Yielder with more than 1 argument, isn't it?
Updated by chopraanmol1 (Anmol Chopra) over 6 years ago
Eregon (Benoit Daloze) wrote:
Changing Enumerator::Yielder#<< to have arity 1 seems fine to me, as I guess nobody calls << on an Enumerator::Yielder with more than 1 argument, isn't it?
Yes, that will be the general case. Exception:
.send(:<<,...)
.<<(...)
Updated by marcandre (Marc-Andre Lafortune) over 6 years ago
Indeed, as long as Yielder#yield
is kept with arity -1 (as in this patch), indeed I don't think that would be an "incompatibility" we should worry about.
Updated by chopraanmol1 (Anmol Chopra) over 6 years ago
- Description updated (diff)
Updated by chopraanmol1 (Anmol Chopra) over 6 years ago
- Description updated (diff)
Updated by knu (Akinori MUSHA) about 6 years ago
I'm the original author of Enumerator & Yielder and I don't think I particularly intended to make <<
accept many arguments. <<
was an alias of yield
, and I just didn't bother to make a separate function. So, I'd say go ahead.
Updated by chopraanmol1 (Anmol Chopra) about 6 years ago
All the responses for the proposed patch are positive till date. @shyouhei (Shyouhei Urabe) can we go ahead with this now?
Updated by shyouhei (Shyouhei Urabe) about 6 years ago
No objection. I guess @nobu (Nobuyoshi Nakada) is trying to fix it.
Updated by nobu (Nobuyoshi Nakada) about 6 years ago
- Status changed from Open to Closed
Applied in changeset trunk|r64770.
Lazy Enumerator reduce intermediate array creation
[ruby-core:87907] [Bug #14908] [Fix GH-1912]
From: Anmol Chopra chopraanmol1@gmail.com