Bug #7715
Closed
Lazy enumerators should want to stay lazy.
Description
I'm just waking up to the fact that many methods turn a lazy enumerator into a non-lazy one.
Here's an example from Benoit Daloze in [ruby-core:44151]:
lines = File.foreach('a_very_large_file').lazy
.select {|line| line.length < 10 }
.map {|line| line.chomp!; line }
.each_slice(3)
.map {|lines| lines.join(';').downcase }
.take_while {|line| line.length > 20 }
That code will produce the right result but will read the whole file, which is not what is desired.
Indeed, each_slice currently does not return a lazy enumerator :-(
To make the above code work as intended, one must call .lazy right after the each_slice(3). I feel this is dangerous and counterintuitive.
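The workaround described above can be sketched as follows. This is only an illustration: the counting `source` enumerator is a hypothetical stand-in for `File.foreach('a_very_large_file')`, and `first(2)` replaces the final `take_while` to keep the result easy to check. On a Ruby where `each_slice` already returns a lazy enumerator, the explicit `.lazy` is redundant but harmless.

```ruby
# Hypothetical stand-in for File.foreach: yields 1000 "lines" and counts reads.
reads = 0
source = Enumerator.new do |yielder|
  1.upto(1_000) do |i|
    reads += 1
    yielder << "line #{i}\n"
  end
end

result = source.lazy
  .select { |line| line.length < 10 }
  .map { |line| line.chomp }
  .each_slice(3)
  .lazy # explicit re-entry into laziness, as described in the report
  .map { |lines| lines.join(';').downcase }
  .first(2)

p result # the first two joined slices
p reads  # far fewer than 1000 when the chain is really lazy
```

If the chain silently stopped being lazy at `each_slice`, the second `map` would consume every line before `first(2)` ran, and `reads` would reach 1000.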
Is there a valid reason for this behavior? Otherwise, I would like us to consider returning a lazy enumerator for the following methods:
(when called without a block)
each_with_object
each_with_index
each_slice
each_entry
each_cons
(always)
chunk
slice_before
The arguments are:
- fail early (much easier to realize one needs to call a final force, to_a or each than realizing that a lazy enumerator chain isn't actually lazy)
- easier to remember (every method normally returning an enumerator returns a lazy enumerator). Basically, this makes Lazy covariant
- I'd expect that if you get lazy at some point, you typically want to remain lazy until the very end
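As a rough sketch of the proposed behavior, the snippet below checks the class returned by each listed method when called on a lazy enumerator. It assumes a Ruby version in which this proposal has been applied; the infinite range is just an illustrative source.

```ruby
lazy = (1..Float::INFINITY).lazy

# Methods called without a block.
blockless = {
  each_with_object: lazy.each_with_object([]),
  each_with_index:  lazy.each_with_index,
  each_slice:       lazy.each_slice(2),
  each_entry:       lazy.each_entry,
  each_cons:        lazy.each_cons(2)
}

# Methods that always return an enumerator.
always = {
  chunk:        lazy.chunk { |n| n.even? },
  slice_before: lazy.slice_before { |n| (n % 3).zero? }
}

blockless.merge(always).each do |name, enum|
  puts "#{name}: #{enum.class}"
end

# Because laziness is preserved, chaining onto an infinite source terminates.
sums = lazy.each_slice(2).map(&:sum).first(3)
p sums
```

With a non-lazy `each_slice`, the `map` in the last line would try to build an infinite array and never return, which is the "fail late" behavior the first argument warns about.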
        
           Updated by ko1 (Koichi Sasada) over 12 years ago
      - Target version set to 2.0.0
Whose ball?
        
           Updated by marcandre (Marc-Andre Lafortune) over 12 years ago
      - Status changed from Open to Assigned
- Assignee set to marcandre (Marc-Andre Lafortune)
I can do it, unless there are objections.
        
           Updated by shugo (Shugo Maeda) over 12 years ago
      marcandre (Marc-Andre Lafortune) wrote:
I can do it, unless there are objections.
Your proposal sounds reasonable.
I guess these methods were forgotten to change when lazy was implemented.
        
           Updated by yhara (Yutaka HARA) over 12 years ago
      shugo (Shugo Maeda) wrote:
I guess these methods were forgotten to change when lazy was implemented.
That's right :-(   I thought these methods did not need to be overridden
because they return Enumerator, but they should return Enumerator::Lazy for such cases.
        
           Updated by marcandre (Marc-Andre Lafortune) over 12 years ago
      I believe I have found the key to resolving this issue, the Lazy.new issue [#7248], and others.
We simply need to specialize to_enum/enum_for for lazy enumerators.
In the same way, RETURN_SIZED_ENUMERATOR should return a lazy enumerator, when called for a lazy enumerator.
With this in mind:
- Lazy#each_with_object, etc., will correctly return lazy enumerators [#7715] without being overridden.
- Lazy#cycle can be removed. It no longer needs to be overridden.
- Lazy.new really has no need to accept (method, *args) and can be modified as proposed in [#7248]
- Any user method of Enumerable that returns an Enumerator using to_enum will conserve laziness.
None of this could create a regression, since Lazy & RETURN_SIZED_ENUMERATOR are both new to 2.0.0
I'm working on a patch...
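The last point in the list above can be illustrated with a small sketch. `each_twice` is a hypothetical user-defined method, not part of Ruby; the example assumes a Ruby in which to_enum is specialized for lazy enumerators as described.

```ruby
module Enumerable
  # Hypothetical user method: yields every element twice.
  def each_twice
    # Without a block, defer to to_enum; on a lazy receiver this is
    # expected to return a lazy enumerator.
    return to_enum(:each_twice) unless block_given?

    each do |value|
      yield value
      yield value
    end
  end
end

eager = [1, 2].each_twice
doubled = (1..Float::INFINITY).lazy.each_twice

puts eager.class   # a plain Enumerator for a non-lazy receiver
puts doubled.class # a lazy enumerator for a lazy receiver

# Chaining on the infinite source terminates because laziness is kept.
p doubled.map { |n| n * 10 }.first(4)
```

The method body never mentions laziness; it only relies on to_enum, which is why no per-method override is needed.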
        
           Updated by marcandre (Marc-Andre Lafortune) over 12 years ago
      Patch almost done, which also fixes #7248
https://github.com/marcandre/ruby/compare/marcandre:master...marcandre:lazy
Still missing:
- tweak inspect
- fix .lazy.size
- couple more tests
        
           Updated by marcandre (Marc-Andre Lafortune) over 12 years ago
      Patch updated, rdoc improved too.
It also makes for a clean API for Lazy.new, and there's even less code (~20 LOC).
I'll review the patch one last time before committing it (in about 5 hours).
        
           Updated by marcandre (Marc-Andre Lafortune) over 12 years ago
      - Status changed from Assigned to Closed
- % Done changed from 0 to 100
This issue was solved with changeset r39058.
Marc-Andre, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
- enumerator.c: Use to_enum for Enumerable methods returning Enumerators.
  This makes Lazy#cycle no longer needed, so it was removed.
  Make Enumerator#chunk and slice_before return lazy Enumerators.
  [Bug #7715]
- internal.h: Remove ref to rb_enum_cycle_size; no longer needed
- enum.c: Make enum_cycle_size static.
- test/ruby/test_lazy_enumerator.rb: Test for above
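As a quick check of the cycle change mentioned in the commit message, the sketch below assumes a Ruby that includes this changeset. The array contents are arbitrary illustrative values.

```ruby
cycled = [1, 2, 3].lazy.cycle

puts cycled.class # expected to be a lazy enumerator, with no Lazy-specific cycle override

# An endless cycle can be mapped and truncated without hanging.
p cycled.map { |n| n * 2 }.first(5)
```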