Bug #7715
closed
Lazy enumerators should want to stay lazy.
Description
I'm just waking up to the fact that many methods turn a lazy enumerator into a non-lazy one.
Here's an example from Benoit Daloze in [ruby-core:44151]:
lines = File.foreach('a_very_large_file').lazy
.select {|line| line.length < 10 }
.map {|line| line.chomp!; line }
.each_slice(3)
.map {|lines| lines.join(';').downcase }
.take_while {|line| line.length > 20 }
That code will produce the right result but will read the whole file, which is not what is desired. Indeed, each_slice currently does not return a lazy enumerator :-(
To make the above code work as intended, one must call .lazy right after the each_slice(3). I feel this is dangerous and counterintuitive.
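The workaround described above can be sketched with a small stand-in for the file. This is an illustration only: the endless range and the `consumed` counter are assumptions replacing `File.foreach`, used to show how many elements are actually read when `.lazy` is re-applied after `each_slice(3)`.

```ruby
consumed = 0

# Stand-in for File.foreach: an endless source that counts each element read.
source = (1..Float::INFINITY).lazy.map { |n| consumed += 1; n }

sums = source
  .each_slice(3)
  .lazy # explicit re-entry into laziness, as the report describes
  .map { |slice| slice.sum }
  .first(2)

puts sums.inspect
puts consumed
```

With the explicit `.lazy`, only the six elements needed for two slices are read. On a Ruby where `each_slice` already returns a lazy enumerator, the extra `.lazy` is harmless.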
Is there a valid reason for this behavior? Otherwise, I would like us to consider returning a lazy enumerator for the following methods:
(when called without a block)
each_with_object
each_with_index
each_slice
each_entry
each_cons
(always)
chunk
slice_before
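A quick way to check the proposed behavior for each listed method is to inspect the class of what it returns. This is a sketch that assumes a Ruby version in which this proposal has been applied; the `checks` hash and its sample arguments are only illustrative.

```ruby
lazy = (1..10).lazy

# Each entry calls one of the listed methods on a lazy enumerator.
checks = {
  each_with_object: lazy.each_with_object([]),
  each_with_index:  lazy.each_with_index,
  each_slice:       lazy.each_slice(3),
  each_entry:       lazy.each_entry,
  each_cons:        lazy.each_cons(2),
  chunk:            lazy.chunk { |n| n.even? },
  slice_before:     lazy.slice_before { |n| n % 3 == 0 }
}

checks.each do |name, enum|
  puts "#{name}: #{enum.class}"
end
```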
The arguments are:
- fail early (much easier to realize one needs to call a final force, to_a or each than realizing that a lazy enumerator chain isn't actually lazy)
- easier to remember (every method normally returning an enumerator returns a lazy enumerator). Basically this makes Lazy covariant
- I'd expect that if you get lazy at some point, you typically want to remain lazy until the very end
Updated by marcandre (Marc-Andre Lafortune) almost 12 years ago
- Status changed from Open to Assigned
- Assignee set to marcandre (Marc-Andre Lafortune)
I can do it, unless there are objections.
Updated by shugo (Shugo Maeda) almost 12 years ago
marcandre (Marc-Andre Lafortune) wrote:
I can do it, unless there are objections.
Your proposal sounds reasonable.
I guess these methods were forgotten to change when lazy was implemented.
Updated by yhara (Yutaka HARA) almost 12 years ago
shugo (Shugo Maeda) wrote:
I guess these methods were forgotten to change when lazy was implemented.
That's right :-( I thought these methods did not need to be overridden
because they return Enumerator, but they should return Enumerator::Lazy for such cases.
Updated by marcandre (Marc-Andre Lafortune) almost 12 years ago
I believe I have found the key to resolve this issue, Lazy.new issue [#7248] and others.
We simply need to specialize to_enum/enum_for for lazy enumerators.
In the same way, RETURN_SIZED_ENUMERATOR should return a lazy enumerator, when called for a lazy enumerator.
With this in mind:
- Lazy#each_with_object, etc., will correctly return lazy enumerators [#7715] without being overridden.
- Lazy#cycle can be removed. It no longer needs to be overridden.
- Lazy.new really has no need to accept (method, *args) and can be modified as proposed in [#7248]
- Any user method of Enumerable that returns an Enumerator using to_enum will conserve laziness.
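The last point can be illustrated with a user-defined method. This is a minimal sketch, assuming a Ruby version where to_enum is specialized for lazy enumerators; `each_doubled` is a hypothetical method invented for the example and is not part of Ruby.

```ruby
module Enumerable
  # Hypothetical user method: yields each element multiplied by two.
  # Without a block it returns an enumerator via to_enum, as is customary.
  def each_doubled
    return to_enum(:each_doubled) unless block_given?
    each { |x| yield x * 2 }
  end
end

eager = [1, 2, 3].each_doubled
lazy  = (1..Float::INFINITY).lazy.each_doubled

puts eager.class
puts lazy.class
puts lazy.first(3).inspect
```

Because `each_doubled` goes through to_enum, it needs no lazy-specific code: called on an array it returns a plain Enumerator, and called on a lazy enumerator it stays lazy, so taking the first three elements of an endless source terminates.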
None of this could create a regression, since Lazy & RETURN_SIZED_ENUMERATOR are both new to 2.0.0
I'm working on a patch...
Updated by marcandre (Marc-Andre Lafortune) over 11 years ago
Patch almost done, which also fixes #7248
https://github.com/marcandre/ruby/compare/marcandre:master...marcandre:lazy
Still missing:
- tweak inspect
- fix .lazy.size
- couple more tests
Updated by marcandre (Marc-Andre Lafortune) over 11 years ago
Patch updated, rdoc improved too.
Makes for a clean API for Lazy.new also, and there's even less code (~20 loc).
I'll review the patch one last time before committing it (in about 5 hours).
Updated by marcandre (Marc-Andre Lafortune) over 11 years ago
- Status changed from Assigned to Closed
- % Done changed from 0 to 100
This issue was solved with changeset r39058.
Marc-Andre, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
- enumerator.c: Use to_enum for Enumerable methods returning Enumerators.
  This makes Lazy#cycle no longer needed, so it was removed.
  Make Enumerator#chunk and slice_before return lazy Enumerators.
  [Bug #7715]
- internal.h: Remove ref to rb_enum_cycle_size; no longer needed
- enum.c: Make enum_cycle_size static.
- test/ruby/test_lazy_enumerator.rb: Test for above