Feature #17312
New methods in Enumerable and Enumerator::Lazy: flatten, product, compact
Status: Closed
Added by zverok (Victor Shepelev) about 4 years ago. Updated almost 4 years ago.
Description
(The offspring of #16987, which was too vague/philosophical)
I propose to add to Enumerable and Enumerator::Lazy the following methods:
- compact
- product
- flatten
All of them can be performed with a one-way enumerator. All of them make sense for situations other than "just an array". All of them can be used for processing large sequences, and therefore meaningful to add to Lazy.
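To illustrate the "one-way enumerator" point, here is a minimal sketch of compact built only on a single forward pass. The helper names one_pass_compact and lazy_compact are made up for this example; they are not the proposed implementation.

```ruby
# Hypothetical helpers for illustration only; not part of the proposal or of Ruby core.

# Drop nil elements using a single forward pass over #each.
def one_pass_compact(enum)
  enum.reject(&:nil?)
end

# Lazy variant: nothing is consumed until the result is forced.
def lazy_compact(enum)
  enum.lazy.reject(&:nil?)
end

eager = one_pass_compact([1, nil, 2, nil, 3].each)

# Works on an infinite sequence because only the needed prefix is evaluated.
odds_or_nil = (1..Float::INFINITY).lazy.map { |i| i.even? ? nil : i }
lazy_result = lazy_compact(odds_or_nil).first(3)
```

Here eager is [1, 2, 3] and lazy_result is [1, 3, 5]; the lazy call never walks past the fifth element of the source.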
Updated by Dan0042 (Daniel DeLorme) about 4 years ago
What would be the interaction between Array#flatten and Enumerable#flatten ?
It's a big compatibility problem if flatten recursively applies to any Enumerable object within an array.
x = Struct.new(:a).new(1) #=> #<struct a=1>
[[[x]]].flatten #=> [x] currently
Enumerable === x #=> true
x.to_a #=> [1]
[[[x]]].flatten #=> [1] would be a problem
Updated by mame (Yusuke Endoh) about 4 years ago
I think Enumerable#compact is trivial and maybe useful.
In regard to Enumerable#flatten, I agree with @Dan0042's concern. I think that it should flatten only Array elements, but it might look unnatural.
I'm unsure if Enumerable#product is useful. Its arguments are repeatedly iterated, so the arguments should be Arrays?
It would be good to separate tickets for each method, and a draft patch would be helpful for discussion.
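The concern about product's arguments being iterated repeatedly can be seen in a small sketch. The helper one_pass_product is hypothetical, written only for this illustration: the receiver is walked once, but the argument is restarted for every receiver element, which a one-shot source (a socket, a file being read) could not support.

```ruby
# Hypothetical helper; the name and signature are assumptions for illustration.
def one_pass_product(enum, other)
  # The receiver is walked once, but `other` is walked once per receiver element.
  enum.flat_map { |a| other.map { |b| [a, b] } }
end

# An enumerator that counts how many times its iteration is started.
passes = 0
arg = Enumerator.new do |y|
  passes += 1
  y << "a"
  y << "b"
end

pairs = one_pass_product([1, 2, 3].each, arg)
```

After this runs, pairs holds all six combinations and passes is 3, one restart of the argument per element of the receiver.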
Updated by zverok (Victor Shepelev) about 4 years ago
@mame (Yusuke Endoh) @Dan0042 (Daniel DeLorme) Oh, you are right: thinking from the Enumerator::Lazy perspective, I missed a huge incompatibility introduced by flatten.
@mame (Yusuke Endoh) I'll split this into several proposals+patches: Enumerable#compact, and, I am starting to think now, maybe Enumerator#flatten would make some sense. As for #product, I just added it for completeness (as a method which can also work with unidirectional enumeration); off the top of my head, I can't remember whether I ever needed it in the past.
Updated by Dan0042 (Daniel DeLorme) about 4 years ago
I understand the thinking behind #flatten; if ary.flatten is possible then why not ary.to_enum.flatten? It should be isomorphic. But even with Enumerator the recursive aspect still represents a compatibility problem. So as long as the behavior of Array#flatten is not modified I think all this is trivial to implement:
module Enumerable
  # Eager versions: materialize to an Array, then delegate to the Array method.
  def compact(...); to_a.compact(...); end
  def product(...); to_a.product(...); end
  def flatten(...); to_a.flatten(...); end
end
edit: oops, sorry, I forgot the point was that Enumerator::Lazy#flatten should return an Enumerator::Lazy
Updated by zverok (Victor Shepelev) about 4 years ago
But even with Enumerator the recursive aspect still represents a compatibility problem.
I am not sure about its severity, though. I mean, the Universe is big, and surely somewhere in it there is code which has an array of enumerators and then calls flatten on them... But I am not sure there is much of this code in the wild.
I believe this situation is in the same rarity class as code which does unless obj.respond_to?(:except) and will be broken by the newly introduced Hash#except method... Like, every change is an incompatibility for somebody, as https://xkcd.com/1172/ points out, but Enumerator#flatten seems quite innocent.
So as long as the behavior of Array#flatten is not modified I think all this is trivial to implement:
def flatten(...); to_a.flatten(...); end
Note that this ticket is a follow-up of #16987. What I am interested in is more usages for .lazy, and an eager implementation of Enumerator::Lazy#flatten is definitely a no-go.
So, I actually could propose just Enumerator::Lazy#flatten, but it seems quite weird that a lazy enumerator can be flattened, while a regular one can't.
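For comparison with the eager to_a version, a lazy flatten might look roughly like the sketch below. It is only an illustration under one assumption taken from the discussion above: only Array elements are flattened, so other Enumerable objects (such as Struct instances) pass through untouched. The name lazy_flatten_arrays is hypothetical.

```ruby
# Hypothetical helper, not the proposed implementation.
# Recursively flattens Array elements only and returns an Enumerator::Lazy.
def lazy_flatten_arrays(enum)
  enum.lazy.flat_map do |item|
    # Lazy#flat_map expands arrays and lazy enumerators returned by the block.
    item.is_a?(Array) ? lazy_flatten_arrays(item) : [item]
  end
end

point = Struct.new(:a).new(1) # Enumerable, but not an Array

# An infinite source of nested arrays; an eager flatten could never finish here.
nested = (1..Float::INFINITY).lazy.map { |i| [i, [i * 10, [point]]] }
result = lazy_flatten_arrays(nested).first(5)
```

Here result is [1, 10, point, 2, 20]: the struct is kept as a single element instead of being expanded into its values.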
Updated by p8 (Petrik de Heus) about 4 years ago
I was really surprised that #last isn't implemented in Enumerable while #first is.
Updated by zverok (Victor Shepelev) about 4 years ago
I was really surprised that #last isn't implemented in Enumerable while #first is.
It is natural.
That's because Enumerable is "uni-directional" (it is not guaranteed that you can iterate through it more than once, and there is no way to go back). Imagine this:
lines = File.foreach('foo.txt') # without a block, returns an Enumerator over the lines
lines.last # -- if it worked, ALL the file is already read here, and you can't do anything reasonable with it
Also, last is "intuitively" cheap ("just give me last element, what's the problem?"), but as Enumerable relies on each, and each only, Enumerable#last would mean "go through entire each till it would be exhausted, and give the last value", which might be very pricey.
All the methods I am trying to propose are compatible with uni-directional #each.
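The cost difference between first and a hypothetical each-only last can be made visible by counting how many elements the source produces. The helper last_via_each below is an assumption for illustration, not an existing Enumerable method.

```ruby
# Hypothetical helper showing what an each-only #last would have to do.
def last_via_each(enum)
  result = nil
  enum.each { |item| result = item } # must visit every element
  result
end

consumed = 0
source = Enumerator.new do |y|
  1.upto(1_000) do |i|
    consumed += 1 # count every element the source has to produce
    y << i
  end
end

first_value = source.first            # stops after one element
after_first = consumed
last_value = last_via_each(source)    # restarts and walks the whole sequence
after_last = consumed - after_first
```

In this sketch first produces a single element (after_first is 1), while the each-based last has to produce all 1,000 (after_last is 1000).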
Updated by p8 (Petrik de Heus) about 4 years ago
@zverok (Victor Shepelev) Thanks for the explanation. That makes a lot of sense!
Updated by matz (Yukihiro Matsumoto) about 4 years ago
- Related to Feature #16987: Enumerator::Lazy vs Array methods added
Updated by matz (Yukihiro Matsumoto) about 4 years ago
My opinion for each proposed method.
- compact - OK, I can imagine use-cases too
- product - Negative; I am concerned about arguments (array or enumerable)
- flatten - Negative; I am concerned about types of elements (array of enumerable)
If you want product and flatten in Enumerable, submit separate issues to persuade me.
Matz.
Updated by zverok (Victor Shepelev) almost 4 years ago
PR for #compact: https://github.com/ruby/ruby/pull/3851
Updated by nobu (Nobuyoshi Nakada) almost 4 years ago
- Status changed from Open to Closed