Actions

Copy link

Feature #6183

closed

Enumerator::Lazy performance issue

Added by gregolsen (Innokenty Mikhailov) over 13 years ago. Updated almost 9 years ago.

Status:

Closed

Assignee:

nobu (Nobuyoshi Nakada)

Target version:

[ruby-core:43529]

Description

I benchmarked Enumerator::Lazy and that's what I got:
user system total real
Lazy: 0.690000 0.010000 0.700000 ( 0.733160)
Normal: 0.160000 0.010000 0.170000 ( 0.186695)

It seems like even with 4 chain links and 3000 elements in initial array, Lazy enumerator is almost 4(!) times slower than the normal case.

Instead of performance benefit we've got 4 times performance drawback.

See test file attached.

Files

Download all files

lazy_test.rb (428 Bytes) lazy_test.rb		gregolsen (Innokenty Mikhailov), 03/21/2012 10:24 PM
bench.rb (389 Bytes) bench.rb	benchmark used for measuring	gregolsen (Innokenty Mikhailov), 04/03/2012 05:43 PM
new_bench.rb (957 Bytes) new_bench.rb		gregolsen (Innokenty Mikhailov), 05/16/2012 05:15 PM
16.05.2013.diff (23.8 KB) 16.05.2013.diff		gregolsen (Innokenty Mikhailov), 05/17/2013 05:48 AM
26_07_2013.diff (24.6 KB) 26_07_2013.diff		gregolsen (Innokenty Mikhailov), 06/26/2013 07:20 PM
26_06_2013_2.diff (24.4 KB) 26_06_2013_2.diff		gregolsen (Innokenty Mikhailov), 06/27/2013 05:24 AM

Related issues 1 (0 open — 1 closed)

Actions

Copy link

#1 [ruby-core:43536]

Updated by trans (Thomas Sawyer) over 13 years ago

Granted that seems like a bit too much overhead, but the advantage to lazy is not in a 1-to-1 comparison with non-lazy. Try removing the .each call at the end and adding .take(10) instead.

Actions

Copy link

#2 [ruby-core:43537]

Updated by mame (Yusuke Endoh) over 13 years ago

Status changed from Open to Rejected

Hello,

Enumerator::Lazy is not a silver bullet; it removes the overhead for
creating an intermediate array, but brings the drawback for calling
a block. Unfortunately, the latter is much bigger than the former.
Thus, in general, Lazy does bring performance drawback.

The worth of Enumerator::Lazy is to extract first some elements from
big array, especially, infinite sequence. For example:

Prime.lazy.select {|x| x % 4 == 3 }.take(10).to_a

The code becomes much complex without Lazy:

a = []
Prime.each do |x|
next if x % 4 != 3
a << x
break if a.size == 10
end

Anyway, this is not a bug. If you have any concrete idea to "fix"
this issue, please reopen the ticket.

Thank you,

--
Yusuke Endoh mame@tsg.ne.jp

Actions

Copy link

#3 [ruby-core:44101]

Updated by gregolsen (Innokenty Mikhailov) over 13 years ago

File bench.rb bench.rb added
File lazy_enumerator.diff added
File code added

=begin
Finally come up with a concrete idea how to "fix" (based on my first PR https://github.com/ruby/ruby/pull/100).

The idea is to keep all blocks (passed with lazy methods like map or select) as (({Proc})) objects inside the enumerator and apply them one by one when value requested ((({to_a})), (({next})), etc)
This strategy avoids enumerator chaining on each lazy method call and eliminates fair amount of 'calling the block' with (({rb_block_call})) operations.
Here's benchmark results:

2.0.0| ~/projects/ruby(trunk)$ rvm ruby-head
2.0.0| ~/projects/ruby(trunk)$ ruby bench.rb
user system total real
Lazy enumerator 1.460000 0.000000 1.460000 ( 1.465739)
Simple array 0.420000 0.000000 0.420000 ( 0.421446)
0.287671 NaN NaN ( 0.287531)
2.0.0| ~/projects/ruby(trunk)$ rvm system
2.0.0| ~/projects/ruby(trunk)$ ruby bench.rb
user system total real
Lazy enumerator 0.770000 0.000000 0.770000 ( 0.764750)
Simple array 0.370000 0.000000 0.370000 ( 0.382653)
0.480519 NaN NaN ( 0.500364)

ruby-head is current trunk compiled, and system ruby - is the same trunk but with my patch applied.
Last row in results is ratio between 'Simple array' time and 'Lazy Enumerator' time.
So, as you can see, with this patch lazy enumerator becomes almost 2 times faster.

It's a 'proof of concept' patch (only map and select added) - let me know if it makes sense.
I believe that using this approach and with your help lazy enumerator performance can be improved significantly.

I'm attaching the diff along with the main part of the source code just in case it's hard to follow the diff.

=end

Actions

Copy link

#4 [ruby-core:44102]

Updated by matz (Yukihiro Matsumoto) over 13 years ago

Status changed from Rejected to Assigned
Assignee set to nobu (Nobuyoshi Nakada)
Priority changed from Normal to 3

Nobu, could you review the patch? And if you don't see any problem, check it in?

Matz.

Actions

Copy link

#5 [ruby-core:44106]

Updated by mame (Yusuke Endoh) over 13 years ago

Hello,

matz (Yukihiro Matsumoto) wrote:

Nobu, could you review the patch? And if you don't see any problem, check it in?

I didn't read the patch, but I found a weird behavior.
I think the approach is interesting, though.

$ cat t.rb
a = (1..10).lazy
p a.map {|x| x + 1 }.to_a
p a.map {|x| x + 1 }.to_a
p a.map {|x| x + 1 }.to_a

without the patch¶

$ ./miniruby.org t.rb
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

with the patch applied¶

$ ./miniruby.new t.rb
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
[4, 5, 6, 7, 8, 9, 10, 11, 12, 13]

--
Yusuke Endoh mame@tsg.ne.jp

Actions

Copy link

#6 [ruby-core:44110]

Updated by gregolsen (Innokenty Mikhailov) over 13 years ago

That's because each time you mapping lazy enumerator another proc objected added to procs array, so in your example you effectively mapping 3 times.
I should return new enumerator object rather than modifying existing one while calling lazy map or select (or whatever lazy method).

A lot of work should be done to finish this patch: all other lazy methods should be added.
Also I'm getting an error while working with big arrays (> 10^4).
But if you are all positive about the approach I'll happily proceed and do my best to make this fully work.

Actions

Copy link

#7 [ruby-core:44112]

Updated by trans (Thomas Sawyer) over 13 years ago

In response to #6250, wouldn't the simplest implementation be to return a new Lazy Enumerator for each lazy call?

See https://github.com/rubyworks/facets/blob/master/lib/core/facets/denumerable.rb for the basic idea.

Maybe storing each block internally in an Enumerator might be a tad more efficient, but I would think the added complexity to Enumerator state would make it not worth it.

Actions

Copy link

#8 [ruby-core:44124]

Updated by gregolsen (Innokenty Mikhailov) over 13 years ago

File new_lazy_enumerator.diff added

Here's the new patch attached - problem, mentioned by Yusuke Endoh, fixed - now I'm creating a new copy of enumerator on each lazy method call.
Also I fixed an error for big arrays - forgot to gc_mark procs array.

Thomas, that's the point - current implementation is very simple and hence very inefficient.
It mimics ruby implementations but as soon as we are in the C sources already - we can come up with something more efficient.

Actions

Copy link

#9 [ruby-core:44125]

Updated by mame (Yusuke Endoh) over 13 years ago

Hello,

2012/4/4 gregolsen (Innokenty Mikhailov) anotheroneman@yahoo.com:

Here's the new patch attached - problem, mentioned by Yusuke Endoh, fixed - now I'm creating a new copy of enumerator on each lazy method call.

Okay, the next problem :-)

(1..10).lazy.select {|x| false }.map {|x| p x }.to_a

should print nothing, but it actually prints 1, 2, ..., 10
with your patch applied. It can be fixed easily, though.

I glanced your patch. It will degrade functional modularity
in enumerator.c. Currently, it not so big problem because
it only implements #map and #select. But I guess implementing
other methods, especially, #cycle and #zip, will make some
functions (process_element and lazy_iterator_block) complex
and hard to maintain.

Thus, until you create the final patch, it is hard to say
whether we can import your patch or not.

--
Yusuke Endoh mame@tsg.ne.jp

Actions

Copy link

#10 [ruby-core:44142]

Updated by nobu (Nobuyoshi Nakada) over 13 years ago

Hi,

(12/04/04 20:59), Yusuke Endoh wrote:

I glanced your patch. It will degrade functional modularity
in enumerator.c. Currently, it not so big problem because
it only implements #map and #select. But I guess implementing
other methods, especially, #cycle and #zip, will make some
functions (process_element and lazy_iterator_block) complex
and hard to maintain.

Agree.

Naturally, this approach which chains lazy enumerator processes
directly should be faster than current one. So I want to see this
being merged in an extensible way.

Thus, until you create the final patch, it is hard to say
whether we can import your patch or not.

And, do not comment out existing code with //. It unnecessarily
increases noise in the patch.

--
Nobu Nakada

Actions

Copy link

#11 [ruby-core:44186]

Updated by gregolsen (Innokenty Mikhailov) over 13 years ago

File lazy_enumerator_hybrid.diff added

(Yusuke Endoh) wrote:

(1..10).lazy.select {|x| false }.map {|x| p x }.to_a
should print nothing, but it actually prints 1, 2, ..., 10

Fixed, thanks.

But I guess implementing
other methods, especially, #cycle and #zip, will make some
functions (process_element and lazy_iterator_block) complex
and hard to maintain.

Agree, those methods (especially #cycle) will be hard to implement in terms of procs chaining approach.

(Nobu Nakada) wrote:

So I want to see this being merged in an extensible way.

I came up with a new hybrid patch.
It uses procs chaining to handle lazy #map and #select, and current enumerator chaining approach for other methods.
I believe this is an extensible way. We can move forward step by step and we can stop any time.
If those tricky #zip and #cycle methods optimization won't worth the code complexity added - we can leave it as it is: based on enumerator chaining.
See new lazy_enumerator_hybrid.diff (tests are green except test_inspect).

Please, let me know you thoughts about this.

Actions

Copy link

#12 [ruby-core:44393]

Updated by Anonymous about 13 years ago

It's a 'proof of concept' patch (only map and select added) - let me know if it makes sense.
I believe that using this approach and with your help lazy enumerator performance can be improved significantly.

That would be awesome.
-r

Actions

Copy link

#13 [ruby-core:45076]

Updated by gregolsen (Innokenty Mikhailov) about 13 years ago

File 16_may.diff added
File new_bench.rb new_bench.rb added

=begin
Here's an update.
All methods except (({Lazy#cycle})), (({Lazy#zip})) and (({Lazy#flat_map})) are optimized.
Benchmark results shown below.

I was working in ((<this branch|URL:https://github.com/gregolsen/ruby/tree/lazy_enumerator_optimization>)).
Cycle, zip and flat_map are really tough to convert to procs chaining, however they are working fine in this hybrid solution and can be leave as it is.

All tests pass except test_inspect.
If this implementation is acceptable then the next step will be to fix inspect and add more tests to cover different types of chaining:
like chaining of enumerator chained (cycle, zip, flat_map) methods with procs chained optimized methods.

Please, let me know your thoughts.

Thanks.

======================Lazy#map
user system total real
Trunk 1.420000 0.010000 1.430000 ( 1.461111)
Optimized 0.830000 0.000000 0.830000 ( 0.833560)
======================Lazy#select
user system total real
Trunk 1.270000 0.010000 1.280000 ( 1.274074)
Optimized 0.780000 0.000000 0.780000 ( 0.785303)
======================Lazy#grep
user system total real
Trunk 1.540000 0.010000 1.550000 ( 1.552651)
Optimized 0.780000 0.000000 0.780000 ( 0.783526)
======================Lazy#take_while
user system total real
Trunk 1.260000 0.000000 1.260000 ( 1.257465)
Optimized 0.780000 0.010000 0.790000 ( 0.784682)
======================Lazy#drop_while
user system total real
Trunk 1.030000 0.000000 1.030000 ( 1.030987)
Optimized 0.400000 0.000000 0.400000 ( 0.394112)
======================Lazy#reject
user system total real
Trunk 1.240000 0.000000 1.240000 ( 1.243565)
Optimized 0.780000 0.000000 0.780000 ( 0.781825)
======================Lazy#drop
user system total real
Trunk 4.150000 0.000000 4.150000 ( 4.159758)
Optimized 1.590000 0.000000 1.590000 ( 1.598785)
======================Lazy#take
user system total real
Trunk 0.520000 0.000000 0.520000 ( 0.517902)
Optimized 0.010000 0.000000 0.010000 ( 0.003274)
=end

Actions

Copy link

#14 [ruby-core:46506]

Updated by gregolsen (Innokenty Mikhailov) almost 13 years ago

File 2012-07-14.diff added

Here's an update:
Finally I've fixed the last test for Enumerator::Lazy#inspect - now it supports procs chaining too.
Nobuyoshi Nakada, please, see diff attached and let me know your thoughts about this.
Thanks.

Actions

Copy link

#15 [ruby-core:46787]

Updated by mame (Yusuke Endoh) almost 13 years ago

Your patch makes the following code stuck.

p [1,2,3].to_enum.lazy.cycle.take(10).to_a

It should print [1, 2, 3, 1, 2, 3, 1, 2, 3, 1].

--
Yusuke Endoh mame@tsg.ne.jp

Actions

Copy link

#16 [ruby-core:46894]

Updated by gregolsen (Innokenty Mikhailov) almost 13 years ago

File 31_july.diff added

Yusuke Endoh, thanks a lot for pointing out on this issue.
Fixed, please see new diff attached.

Actions

Copy link

#17 [ruby-core:49549]

Updated by zzak (zzak _) over 12 years ago

Assignee changed from nobu (Nobuyoshi Nakada) to mame (Yusuke Endoh)

Yusuke-san seems to be last to review this, what is your opinion?

Actions

Copy link

#18 [ruby-core:50523]

Updated by gregolsen (Innokenty Mikhailov) over 12 years ago

File December_3.diff added

I've merged the patch with latest trunk (see latest diff attached), specifically with Enumerator lazy size feature.
Also I've removed the ugly case switch: now proc entry stores pointer to a function that is executed when iterating over the elements.
So now it even resembles the current implementation a bit.

Please, let me know your thoughts about all this.
Thanks in advance.

Actions

Copy link

#19

Updated by nobu (Nobuyoshi Nakada) over 12 years ago

Tracker changed from Bug to Feature

Actions

Copy link

#20 [ruby-core:50707]

Updated by mame (Yusuke Endoh) over 12 years ago

Sorry, maybe I have no time to review your patch. Anyone interested?

--
Yusuke Endoh mame@tsg.ne.jp

Actions

Copy link

#21 [ruby-core:54663]

Updated by zzak (zzak _) about 12 years ago

Assignee changed from mame (Yusuke Endoh) to zzak (zzak _)

This patch is big, but I hope to review it soon as I will be working on Lazy soon

Actions

Copy link

#22

Updated by gregolsen (Innokenty Mikhailov) about 12 years ago

Zachary Scott, thanks for your interest. I'm afraid it doesn't merge into trunk cleanly anymore after introducing lazy #size. I'll try to fix it and let you know when ready.

Actions

Copy link

#23 [ruby-core:54693]

Updated by zzak (zzak _) about 12 years ago

@gregolsen Sure, np!

Actions

Copy link

#24 [ruby-core:55019]

Updated by gregolsen (Innokenty Mikhailov) about 12 years ago

File 16.05.2013.diff 16.05.2013.diff added

Finally managed to merge. Please see latest diff attached.

Actions

Copy link

#25 [ruby-core:55026]

Updated by zzak (zzak _) about 12 years ago

@gregolsen Thank you! I will try to review this soon, before you have to rebase again ;)

Mind if I delete the old patches? It might confuse someone looking at the ticket.

Actions

Copy link

#26

Updated by gregolsen (Innokenty Mikhailov) about 12 years ago

Sure, feel free to clean old files. Thanks!

Actions

Copy link

#27

Updated by zzak (zzak _) about 12 years ago

File deleted (~~lazy_enumerator.diff~~)

Actions

Copy link

#28

Updated by zzak (zzak _) about 12 years ago

File deleted (~~code~~)

Actions

Copy link

#29

Updated by zzak (zzak _) about 12 years ago

File deleted (~~new_lazy_enumerator.diff~~)

Actions

Copy link

#30

Updated by zzak (zzak _) about 12 years ago

File deleted (~~lazy_enumerator_hybrid.diff~~)

Actions

Copy link

#31

Updated by zzak (zzak _) about 12 years ago

File deleted (~~16_may.diff~~)

Actions

Copy link

#32

Updated by zzak (zzak _) about 12 years ago

File deleted (~~2012-07-14.diff~~)

Actions

Copy link

#33

Updated by zzak (zzak _) about 12 years ago

File deleted (~~31_july.diff~~)

Actions

Copy link

#34

Updated by zzak (zzak _) about 12 years ago

File deleted (~~December_3.diff~~)

Actions

Copy link

#35 [ruby-core:55219]

Updated by zzak (zzak _) about 12 years ago

@gregolsen Looks like some of this was applied, sorry I haven't had a chance to check it yet.

Could you check?

It may need a rebase, the patch didn't apply cleanly and some of the functions already exist (like append_method())

Actions

Copy link

#36 [ruby-core:55259]

Updated by gregolsen (Innokenty Mikhailov) about 12 years ago

Indeed append_method was exctracted by nobu 2 weeks ago as a refactoring of enumerator_inspect. But that's it, nothing was merged yet. I'm not sure I'll be able to rebase patch in next few weeks - got only android tablet with me. I'll let you know when ready. Thanks a lot for your interest.

Actions

Copy link

#37 [ruby-core:55661]

Updated by gregolsen (Innokenty Mikhailov) about 12 years ago

File 26_07_2013.diff 26_07_2013.diff added

Rebased towards latest trunk.

Actions

Copy link

#38 [ruby-core:55663]

Updated by nobu (Nobuyoshi Nakada) about 12 years ago

When will generator::hybrid become other than Qfalse?

Actions

Copy link

#39 [ruby-core:55665]

Updated by gregolsen (Innokenty Mikhailov) about 12 years ago

File 26_06_2013_2.diff 26_06_2013_2.diff added

@nobu (Nobuyoshi Nakada) thanks for pointing!
Indeed hybrid flag is already obsolete since I'm checking for proc_entry presence.
If no proc entries present - than it's effectively a hybrid case.
Fixed patch attached.

PS
I'm using this github branch - https://github.com/gregolsen/ruby/tree/enumerator_lazy_optimization
Should I still attach diffs here?

Actions

Copy link

#40 [ruby-core:55669]

Updated by nobu (Nobuyoshi Nakada) about 12 years ago

Your patch seems

reverting inspect_enumerator()
containing dead code

I pushed a branch which split and merged a few functions, and used function pointer tables.
https://github.com/nobu/ruby/tree/lazy_enum

Actions

Copy link

#41 [ruby-core:55731]

Updated by gregolsen (Innokenty Mikhailov) about 12 years ago

Indeed I messed up the patch with numerous rebases.
Thanks a lot for your refactoring branch - everything looks great.

So what our next steps?

Actions

Copy link

#42 [ruby-core:55739]

Updated by zzak (zzak _) about 12 years ago

Assignee changed from zzak (zzak _) to nobu (Nobuyoshi Nakada)

Actions

Copy link

#43 [ruby-core:77124]

Updated by fbernier (François Bernier) almost 9 years ago

As of today, Enumerable::Lazy is pretty much still unused because of the performance hit. Is there anything I could do to help getting this in? Is there a paper on that subject I could read about to make improvements?

Actions

Copy link

#44 [ruby-core:77130]

Updated by nobu (Nobuyoshi Nakada) almost 9 years ago

I rebased the branch (and fixed a bug in the trunk).
Seems 30~40% faster than the current implementation.

Actions

Copy link

#45

Updated by nobu (Nobuyoshi Nakada) almost 9 years ago

Status changed from Assigned to Closed

Applied in changeset r56185.

enumerator.c: lazy enum improvement

enumerator.c (lazy_init_yielder): directly call stored functions.
[Feature #6183]
enumerator.c (lazy_add_method): create lazy enumerator which
uses lazy_init_yielder().

Actions

Copy link

Also available in: Atom PDF

Like0

Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0

Project

General

Profile

Ruby

Tags

Custom queries

Feature #6183

Enumerator::Lazy performance issue

Updated by trans (Thomas Sawyer) over 13 years ago

Updated by mame (Yusuke Endoh) over 13 years ago

Updated by gregolsen (Innokenty Mikhailov) over 13 years ago

Updated by matz (Yukihiro Matsumoto) over 13 years ago

Updated by mame (Yusuke Endoh) over 13 years ago

without the patch¶

with the patch applied¶

Updated by gregolsen (Innokenty Mikhailov) over 13 years ago

Updated by trans (Thomas Sawyer) over 13 years ago

Updated by gregolsen (Innokenty Mikhailov) over 13 years ago

Updated by mame (Yusuke Endoh) over 13 years ago

Updated by nobu (Nobuyoshi Nakada) over 13 years ago

Updated by gregolsen (Innokenty Mikhailov) over 13 years ago

Updated by Anonymous about 13 years ago

Updated by gregolsen (Innokenty Mikhailov) about 13 years ago

Updated by gregolsen (Innokenty Mikhailov) almost 13 years ago

Updated by mame (Yusuke Endoh) almost 13 years ago

Updated by gregolsen (Innokenty Mikhailov) almost 13 years ago

Updated by zzak (zzak _) over 12 years ago

Updated by gregolsen (Innokenty Mikhailov) over 12 years ago

Updated by nobu (Nobuyoshi Nakada) over 12 years ago

Updated by mame (Yusuke Endoh) over 12 years ago

Updated by zzak (zzak _) about 12 years ago

Updated by gregolsen (Innokenty Mikhailov) about 12 years ago

Updated by zzak (zzak _) about 12 years ago

Updated by gregolsen (Innokenty Mikhailov) about 12 years ago

Updated by zzak (zzak _) about 12 years ago

Updated by gregolsen (Innokenty Mikhailov) about 12 years ago

Updated by zzak (zzak _) about 12 years ago

Updated by zzak (zzak _) about 12 years ago

Updated by zzak (zzak _) about 12 years ago

Updated by zzak (zzak _) about 12 years ago

Updated by zzak (zzak _) about 12 years ago

Updated by zzak (zzak _) about 12 years ago

Updated by zzak (zzak _) about 12 years ago

Updated by zzak (zzak _) about 12 years ago

Updated by zzak (zzak _) about 12 years ago

Updated by gregolsen (Innokenty Mikhailov) about 12 years ago

Updated by gregolsen (Innokenty Mikhailov) about 12 years ago

Updated by nobu (Nobuyoshi Nakada) about 12 years ago

Updated by gregolsen (Innokenty Mikhailov) about 12 years ago

Updated by nobu (Nobuyoshi Nakada) about 12 years ago

Updated by gregolsen (Innokenty Mikhailov) about 12 years ago

Updated by zzak (zzak _) about 12 years ago

Updated by fbernier (François Bernier) almost 9 years ago

Updated by nobu (Nobuyoshi Nakada) almost 9 years ago

Updated by nobu (Nobuyoshi Nakada) almost 9 years ago