Feature #6636
closedEnumerable#size
Description
Now that it has been made clear that Enumerable#count
never calls #size
and that we have Enumerable#lazy
, let me propose again an API for a lazy way to get the size of an Enumerable: Enumerable#size
.
- call-seq:
- enum.size # => nil, Integer or Float::INFINITY
- Returns the number of elements that will be yielded, without going through
- the iteration (i.e. lazy), or +nil+ if it can't be calculated lazily.
- perm = (1..100).to_a.permutation(4)
- perm.size # => 94109400
- perm.each_cons(2).size # => 94109399
- loop.size # => Float::INFINITY
- [42].drop_while.size # => nil
About 66 core methods returning enumerators would have a lazy size
, like each_slice
, permutation
or lazy.take
.
A few would have size
return nil
:
Array#{r}index, {take|drop}_while
Enumerable#find{_index}, {take|drop}_while
IO: all methods
Sized enumerators can also be created naturally by providing a block to to_enum
/enum_for
or a lambda to Enumerator.new
.
Example for to_enum
:
class Integer
def composition
return to_enum(:composition){ 1 << (self - 1) } unless block_given?
yield [] if zero?
downto(1) do |i|
(self - i).composition do |comp|
yield [i, *comp]
end
end
end
end
4.composition.to_a
# => [[4], [3, 1], [2, 2], [2, 1, 1], [1, 3], [1, 2, 1], [1, 1, 2], [1, 1, 1, 1]]
42.composition.size # => 2199023255552
Example for Enumerator.new
:
def lazy_product(*enums)
sizer = ->{
enums.inject(1) do |product, e|
break if (size = e.size).nil?
product * size
end
}
Enumerator.new(sizer) do |yielder|
# ... generate combinations
end
end
lazy_product(1..4, (1..3).each_cons(2)).size # => 8
lazy_product(1..4, (1..3).cycle).size # => Float::INFINITY
Files
Updated by marcandre (Marc-Andre Lafortune) over 12 years ago
- File enumsize.pdf enumsize.pdf added
Attaching one-minute slide
Updated by mame (Yusuke Endoh) over 12 years ago
- Status changed from Open to Assigned
Received. Thank you!
--
Yusuke Endoh mame@tsg.ne.jp
Updated by naruse (Yui NARUSE) over 12 years ago
How about adding Enumerator#receiver and define yourself with it.
diff --git a/enumerator.c b/enumerator.c
index f01ddd5..8e3ae9a 100644
--- a/enumerator.c
+++ b/enumerator.c
@@ -942,6 +942,32 @@ enumerator_inspect(VALUE obj)
}
/*
-
- call-seq:
-
- e.receiver -> object
-
-
- Returns the receiver of this enumerator.
- */
+static VALUE
+enumerator_receiver(VALUE obj)
+{
- struct enumerator *e;
- VALUE eobj;
- TypedData_Get_Struct(obj, struct enumerator, &enumerator_data_type, e);
- if (!e || e->obj == Qundef) {
-
return Qnil;
- }
- eobj = rb_attr_get(obj, id_receiver);
- if (NIL_P(eobj)) {
-
eobj = e->obj;
- }
- return eobj;
+}
+/*
- Yielder
*/
static void
@@ -1748,6 +1774,7 @@ InitVM_Enumerator(void)
rb_define_method(rb_cEnumerator, "feed", enumerator_feed, 1);
rb_define_method(rb_cEnumerator, "rewind", enumerator_rewind, 0);
rb_define_method(rb_cEnumerator, "inspect", enumerator_inspect, 0);
-
rb_define_method(rb_cEnumerator, "receiver", enumerator_receiver, 0);
/* Lazy */
rb_cLazy = rb_define_class_under(rb_cEnumerator, "Lazy", rb_cEnumerator);
irb(main):007:0> e="abcde".enum_for(:each_byte)
=> #<Enumerator: "abcde":each_byte>
irb(main):009:0> def e.size; receiver.bytesize; end
=> nil
irb(main):010:0> e.size
=> 5
Updated by marcandre (Marc-Andre Lafortune) over 12 years ago
Hi,
On Sat, Jul 21, 2012 at 2:43 AM, naruse (Yui NARUSE) naruse@airemix.jp wrote:
How about adding Enumerator#receiver and define yourself with it.
I agree receiver
could be helpful. One would also need the method
and the arguments
, as in my request #3714.
Still, it doesn't really address the issue.
If someone wants to write a library to output the progression, for example, it is still not possible for a general enumerable/enumerator.
The proposal is so that:
- it is standard so anyone can depend on it and also create their own enumerables/enumerators
- it can help in some calculations
- it can help to have generic progression reports
- etc.
Thanks
Updated by mame (Yusuke Endoh) over 12 years ago
- Status changed from Assigned to Feedback
Marc-Andre Lafortune,
We discussed your slide at the developer meeting (7/21).
Matz was positive to the spec of return value: Integer, Float::
INFINITY, and nil.
However, we couldn't understand what API is proposed for creating
an Enumeartor with size.
So, please revise and elaborate your API according to these two
points:
-
Enumerator.new(size) is not acceptable because of compatibility:
p Enumerator.new([1,2,3]).take(2) #=> [1, 2]
-
We cannot determine the size of enumerator when creating it:
a = [1]
e = a.permutation
a << 2
p e.to_a #=> [[1, 2], [2, 1]]So, the API may need to receive a code fragment that calculates
size, such as a Proc.
--
Yusuke Endoh mame@tsg.ne.jp
Updated by marcandre (Marc-Andre Lafortune) over 12 years ago
- Status changed from Feedback to Open
Hi,
mame (Yusuke Endoh) wrote:
Matz was positive to the spec of return value: Integer, Float::
INFINITY, and nil.
:-)
However, we couldn't understand what API is proposed for creating
an Enumeartor with size.So, please revise and elaborate your API according to these two
points:
Enumerator.new(size) is not acceptable because of compatibility:
p Enumerator.new([1,2,3]).take(2) #=> [1, 2]
Agreed.
I am proposing Enumerator.new(size_lambda){ block }
, i.e. only if a block is given, then the first argument can be a lambda/proc that can lazily compute the size.
The old syntax of Enumerator.new
without a block does not change meaning.
We cannot determine the size of enumerator when creating it:
a = [1]
e = a.permutation
a << 2
p e.to_a #=> [[1, 2], [2, 1]]So, the API may need to receive a code fragment that calculates
size, such as a Proc.
Agreed.
This is why I propose that to_enum
accepts a block that can calculate the size, and Enumerator.new with a block can accept a lambda/proc for the same.
Does this address the concerns?
I will be glad to propose a set of patches so we can experiment with this.¶
Marc-André
Updated by mame (Yusuke Endoh) over 12 years ago
Hello Marc-Andre
2012/7/24, marcandre (Marc-Andre Lafortune) ruby-core@marc-andre.ca:
Enumerator.new(size) is not acceptable because of compatibility:
p Enumerator.new([1,2,3]).take(2) #=> [1, 2]
Agreed.
I am proposingEnumerator.new(size_lambda){ block }
, i.e. only if a block
is given, then the first argument can be a lambda/proc that can lazily
compute the size.
This is just my guess, but matz will not like such a method whose
meaning of its argument varies depending on whether block is given
or not.
The old syntax of
Enumerator.new
without a block does not change meaning.
Is it okay that there is no way to specify size in this case?
We cannot determine the size of enumerator when creating it:
a = [1]
e = a.permutation
a << 2
p e.to_a #=> [[1, 2], [2, 1]]So, the API may need to receive a code fragment that calculates
size, such as a Proc.Agreed.
This is why I propose thatto_enum
accepts a block that can calculate the
size, and Enumerator.new with a block can accept a lambda/proc for the
same.
What argument(s) will the lambda/proc receive?
--
Yusuke Endoh mame@tsg.ne.jp
Updated by marcandre (Marc-Andre Lafortune) over 12 years ago
Hi,
mame (Yusuke Endoh) wrote:
I am proposing
Enumerator.new(size_lambda){ block }
, i.e. only if a block
is given, then the first argument can be a lambda/proc that can lazily
compute the size.This is just my guess, but matz will not like such a method whose
meaning of its argument varies depending on whether block is given
or not.
I understand the concern.
It could still be acceptable here because the other form is already documented as 'discouraged'. Maybe we should deprecate it?
Other possibility would be to add a different creator, e.g. Enumerator.sized(size_lambda){|yielder| ... }
.
The old syntax of
Enumerator.new
without a block does not change meaning.Is it okay that there is no way to specify size in this case?
This old syntax is already discouraged and to_enum
/enum_for
should be used instead.
This is why I propose that
to_enum
accepts a block that can calculate the
size, and Enumerator.new with a block can accept a lambda/proc for the
same.What argument(s) will the lambda/proc receive?
We could consider passing the receiver and/or any arguments passed to to_enum
, but I would propose to keep it simple and pass no arguments.
This is because enumerators are immutable and all information held in Enumerators should be accessible from the block/lambda anyways.¶
Marc-André
Updated by ko1 (Koichi Sasada) about 12 years ago
- Assignee changed from matz (Yukihiro Matsumoto) to mame (Yusuke Endoh)
Updated by mame (Yusuke Endoh) about 12 years ago
- Status changed from Open to Assigned
- Assignee changed from mame (Yusuke Endoh) to matz (Yukihiro Matsumoto)
- Target version changed from 2.0.0 to 2.6
Updated by marcandre (Marc-Andre Lafortune) about 12 years ago
Hi.
My understanding was this new feature would make it into Ruby 2.0. Did I misunderstand?
The implementation can be seen here: https://github.com/marcandre/ruby/compare/marcandre:trunk...marcandre:enum_size
Although the combined diff (https://github.com/marcandre/ruby/compare/marcandre:trunk...marcandre:enum_size.diff ) is pretty big, there are really just two interesting commits. The second adds #size and extends constructor. The third extends to_enum
to accept a block. See https://github.com/marcandre/ruby/commit/add_enumerator_size and https://github.com/marcandre/ruby/commit/sized_to_enum
The first commit only warns on using the deprecated form with no block Enumerator.new(obj, *args)
, but compatibility is maintained.
The remaining commits add support for the different ways of creating enumerators that can evaluate lazily their size. A few remain to be implemented, in particular the lazy ones.
Updated by marcandre (Marc-Andre Lafortune) about 12 years ago
I added support for lazy enumerators.
Yusuke, Matz, did I address your questions/concerns?
Updated by matz (Yukihiro Matsumoto) about 12 years ago
After skimming your modifies, I feel they are decent.
Sorry for being late to check.
Matz.
Updated by marcandre (Marc-Andre Lafortune) about 12 years ago
- Assignee changed from matz (Yukihiro Matsumoto) to marcandre (Marc-Andre Lafortune)
- Target version changed from 2.6 to 2.0.0
Hi,
matz (Yukihiro Matsumoto) wrote:
After skimming your modifies, I feel they are decent.
Thanks for looking at them. Very happy to read this :-)
I'll then commit these and finish implementing size for ranges of strings.¶
Marc-André
Updated by marcandre (Marc-Andre Lafortune) about 12 years ago
- Status changed from Assigned to Closed
- % Done changed from 0 to 100
This issue was solved with changeset r37495.
Marc-Andre, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
- enumerator.c (enumerator_initialize): Warn when using deprecated form
[Feature #6636]