Feature #21701
Open
Enumerator.produce accepts an optional `size` keyword argument
Description
Enumerator::Producer#size currently always returns Float::INFINITY, and there is no way to specify a different value.
However, a produce sequence is known to be finite in many cases, and in some cases you can even state or compute its exact size.
# Infinite enumerator
enum = Enumerator.produce(1, size: Float::INFINITY, &:succ)
enum.size # => Float::INFINITY
# Finite enumerator with known/computable size
abs_dir = File.expand_path("./baz") # => "/foo/bar/baz"
traverser = Enumerator.produce(abs_dir, size: -> { abs_dir.count("/") + 1 }) {
  raise StopIteration if it == "/"
  File.dirname(it)
}
traverser.size # => 4
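For contrast, here is a minimal sketch of the behavior this proposal addresses, with no `size:` keyword given: the sequence is finite, yet its reported size is infinite. The `countdown` name is only illustrative, and the block takes an explicit parameter instead of `it` so that it also runs on Ruby versions before 3.4.

```ruby
# A finite produce sequence: 3, 2, 1, then the block raises StopIteration.
countdown = Enumerator.produce(3) do |n|
  raise StopIteration if n <= 1
  n - 1
end

elements = countdown.first(10) # asks for up to 10 elements, but only 3 exist
reported = countdown.size      # the size reported when none is specified

p elements # prints [3, 2, 1]
p reported # prints Infinity
```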
As background, Enumerator#to_set was changed to refuse to work on an infinite enumerator in https://bugs.ruby-lang.org/issues/21513.
That made #to_set always fail on produce sequences. This enhancement fixes the problem (kind of) safely, and it also lets us add the same size check to #to_a, where it is currently missing, which is inconsistent with #to_set.
Here's the implementation: https://github.com/ruby/ruby/pull/15277
Updated by knu (Akinori MUSHA) 21 days ago
- Related to Bug #21513: Converting endless range to set hangs added
Updated by Eregon (Benoit Daloze) 8 days ago · Edited
I think having to provide a size in those cases just to avoid a raise is too inconvenient.
I agree with what has been said in #21654: we should not check and raise for infinite loops unless we are absolutely certain they are infinite.
A size of nil means unknown; it does not mean infinite, so it is always incorrect to raise when the size is unknown.
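To make the three possible answers from Enumerator#size concrete, here is a small sketch. The `certainly_infinite?` helper is hypothetical, not part of Ruby or of the linked pull request; it only illustrates a check that treats an unknown (nil) size as "not known to be infinite".

```ruby
known    = [1, 2, 3].each.size                  # an Integer: the size is known
infinite = loop.size                            # Float::INFINITY: known to be infinite
unknown  = Enumerator.new { |y| y << 1 }.size   # nil: the size is simply unknown

# Hypothetical helper: true only when the enumerator itself reports
# an infinite size. A nil size is not treated as infinite.
def certainly_infinite?(enum)
  enum.size == Float::INFINITY
end

p known    # prints 3
p infinite # prints Infinity
p unknown  # prints nil

p certainly_infinite?(loop)                           # prints true
p certainly_infinite?(Enumerator.new { |y| y << 1 })  # prints false
```

Under this policy, a conversion method would raise only in the first of the two final cases.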
Updated by Eregon (Benoit Daloze) 8 days ago
Ah, I missed that Enumerator.produce(...).size returns Infinity.
I think that's the problem: it should return nil because it doesn't know whether the sequence is infinite; as you say, in many cases it is not.
If that's done, I think adding an optional size is fine, although probably not so useful, because people can just do def enum.size = ... (though that creates a singleton class, so it's not ideal).
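The singleton-method workaround mentioned above can be sketched as follows. The sequence and the value 3 are made up for illustration, and the method is written in long form, which is equivalent to the endless `def enum.size = 3` available since Ruby 3.0.

```ruby
# A finite produce sequence: 1, 2, 3.
enum = Enumerator.produce(1) do |n|
  raise StopIteration if n >= 3
  n + 1
end

# Override #size on this one object only. This defines a singleton
# method, which is the drawback noted in the comment above.
def enum.size
  3
end

size_after_override = enum.size
singleton_names     = enum.singleton_methods
elements            = enum.first(10) # iteration itself is unaffected

p size_after_override # prints 3
p singleton_names     # prints [:size]
p elements            # prints [1, 2, 3]
```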