Feature #20664

Add `before` and `until` options to Enumerator.produce

Added by knu (Akinori MUSHA) 4 months ago. Updated 3 months ago.

Status: Open
Assignee: -
Target version: -
[ruby-core:118784]

Description

Enumerator.produce provides a nice way to generate an infinite sequence, but it makes defining how a sequence ends a bit awkward. It lacks a simple, intuitive syntax for producing typical finite sequences.

This proposal attempts to solve the problem by adding these two options to the method:

  • before: when provided, it is used as a predicate to determine if an iteration should end before a generated value gets yielded.
  • until: when provided, it is used as a predicate to determine if an iteration should end after a generated value gets yielded.

Any value that responds to to_proc and returns a Proc object is accepted in these options.

A typical use case for the before option is traversing a tree structure to iterate over the ancestors or following/preceding siblings of a node.

The until option can be used when there is a clear definition of the "last" value to yield.

enum = Enumerator.produce(File, before: :nil?, &:superclass)
enum.to_a #=> [File, IO, Object, BasicObject]

enum = Enumerator.produce(3, until: :zero?, &:pred)
enum.to_a #=> [3, 2, 1, 0]
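
For readers who want to try these semantics on a current Ruby, here is a minimal sketch that emulates the two options on top of the existing Enumerator.produce. The helper name `produce_with` and the keyword `until_` are assumptions made for this illustration; they are not part of the proposal or of Ruby.

```ruby
# Hypothetical helper, NOT part of Ruby or of this proposal.
# `until_` carries a trailing underscore because `until` is a reserved word
# and cannot be read back as an ordinary local variable.
def produce_with(initial, before: nil, until_: nil, &step)
  before = before&.to_proc
  until_ = until_&.to_proc
  Enumerator.new do |yielder|
    Enumerator.produce(initial, &step).each do |value|
      break if before&.call(value)   # end before this value is yielded
      yielder << value
      break if until_&.call(value)   # end after this value is yielded
    end
  end
end

ancestors = produce_with(File, before: :nil?, &:superclass).to_a
#=> [File, IO, Object, BasicObject]
countdown = produce_with(3, until_: :zero?, &:pred).to_a
#=> [3, 2, 1, 0]
```

Without either option the helper behaves like plain Enumerator.produce and stays infinite.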

Related issues 2 (1 open, 1 closed)

Related to Ruby master - Feature #14781: Enumerator.generate (Closed)
Related to Ruby master - Feature #20625: Object#chain_of (Open)
Updated by zverok (Victor Shepelev) 4 months ago

I am not sure about this API.

I think in the language core there aren’t many APIs that accept just a symbol naming the method to call (only reduce(:+) comes to mind, and I am still not sure why this form exists, because it seems to have been introduced at the same time as Symbol#to_proc, so reduce(:+) and reduce(&:+) have always co-existed).

Mostly callables are passed as a block (and therefore there can be only one); but some APIs accept another callable (any object with #call method, like Enumerator.new).

So, what if the condition is not a method of the sequence? Should we accept callables, too? Or what if the method’s user expects it to be a particular value (like until: 0), or a pattern (like before: 0..1)?

The alternative is

Enumerator.produce(File, &:superclass).take_until(&:nil?)

...which is more or less the same character-count-wise, more powerful (any block can be used), and more atomic.

The one problem is that we currently have neither Enumerable#take_until nor Object#not_nil? to write something like

# this wouldn’t work
Enumerator.produce(File, &:superclass).take_while(&:not_nil?)
# though one can use
Enumerator.produce(File, &:superclass).take_while(&:itself)
#=> [File, IO, Object, BasicObject]

...but in general, I suspect adding Enumerable#take_until to handle such cases (and #take_while_after while we are at it :)) might be a more powerful addition to the language, useful in many situations.
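
To make the suggestion concrete, here is a minimal sketch of what an exclusive take_until could look like. It is written as a free-standing method rather than a patch to Enumerable, and its name and semantics are assumptions for illustration only, since no such method exists in Ruby.

```ruby
# Hypothetical: Ruby has no Enumerable#take_until.
# This version is exclusive: the element that satisfies the block is dropped,
# which makes it take_while with the block negated.
def take_until(enum)
  enum.take_while { |element| !yield(element) }
end

take_until(Enumerator.produce(File, &:superclass), &:nil?)
#=> [File, IO, Object, BasicObject]
```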

Updated by knu (Akinori MUSHA) 4 months ago

This proposal is based on the potential use cases I have experienced over the years. I've rarely seen a need for infinite sequences that can be defined with produce, and that is why I want to give produce() a feature-complete constructor.

Almost all sequences have had clear and simple end conditions. Traversing a tree structure for ancestor or sibling nodes would be the most typical use case, and predicates like nil? and root? are mostly enough. Type-based conditions and inclusion conditions are rarely seen, probably because sequences tend to be homogeneous and there is rarely more than one terminal value, or a range of them.

Updated by knu (Akinori MUSHA) 4 months ago

These options should take callables in this proposal. Procs and Methods certainly meet the condition: "Any value that responds to to_proc and returns a Proc object is accepted in these options".
The implementation does not bother to call to_proc on Procs, though.
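
As an illustration of that rule only, the conversion could be sketched in Ruby as follows. The helper name `to_predicate` and the error messages are assumptions; the actual patch is not shown in this thread and may behave differently.

```ruby
# Hypothetical sketch of the acceptance rule described above; not the
# actual implementation.
def to_predicate(value)
  return value if value.is_a?(Proc)   # Procs are used as-is, no to_proc call
  unless value.respond_to?(:to_proc)
    raise TypeError, "#{value.inspect} does not respond to to_proc"
  end
  converted = value.to_proc
  raise TypeError, "to_proc did not return a Proc" unless converted.is_a?(Proc)
  converted
end

to_predicate(:zero?).call(0)           #=> true  (Symbol)
to_predicate(->(n) { n > 1 }).call(3)  #=> true  (Proc)
to_predicate(5.method(:==)).call(5)    #=> true  (Method)
```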

Updated by matheusrich (Matheus Richard) 3 months ago

> The one problem is that we currently have neither Enumerable#take_until nor Object#not_nil? to write something like

After proposing Object#chain_of, I realized how missing one of these really makes things harder than they need to be.

With 3.4's it, the expression gets a bit more readable:

Enumerator.produce(File, &:superclass).take_while { !it.nil? }

IMO this pattern is common enough to deserve an optimization. #not_nil? would probably be harder to add (people will start talking about present? and how it is longer than !<>.nil?), so maybe #take_until will be easier to get approved.

Updated by ufuk (Ufuk Kayserilioglu) 3 months ago · Edited

@matheusrich (Matheus Richard) In my opinion take_until might be an interesting method to add, but I think we are unnecessarily complicating the example with it and nil?.

The expression is simply:

Enumerator.produce(File, &:superclass).take_while(&:itself)

and it works perfectly fine and, IMO, is very readable for anyone who knows enough to reach for Enumerator.produce in the first place.

With respect to the original proposal in this ticket, I also find it a little awkward when Ruby methods take something callable other than blocks, but I understand the pragmatic use of the proposed to_procable keyword arguments that would satisfy the majority of cases where one would reach for Enumerator.produce.

Updated by zverok (Victor Shepelev) 3 months ago

> These options should take callables in this proposal. Procs and Methods certainly meet the condition: "Any value that responds to to_proc and returns a Proc object is accepted in these options".

Oh, yeah, sorry, missed this part, focused just on Symbol examples.

Interesting, I don’t think we have any API in core like this, accepting anything and converting it with #to_proc implicitly. Usually when a second callable is needed, the agreement is that it should respond to #call, not to #to_proc (examples: Proc#>>, Enumerator.new).

I am not sure it is “how it should be” (there are not a lot of APIs like this anyway), but the approach of passing #to_proc-able things in many keyword arguments seems more Rails-ish. Maybe it is time to accept it.

Updated by Eregon (Benoit Daloze) 3 months ago

One issue with take_until is: does it include the element for which it yielded true?
In the description example Enumerator.produce(3, until: :zero?, &:pred) the result does include 0.
But for Enumerator.produce(parent, &:parent_directory).take_until(&:nil?) the intention is to not include nil in the result.
Maybe we should have 2 variants of take_until, or a keyword argument controlling whether the last element is included.

IMO before:/until: kwargs for Enumerator.produce feel too ad-hoc.
take_until is something I wished existed already.
But we need to address whether it includes the last element or not.

Updated by matheusrich (Matheus Richard) 3 months ago

IMO take_until shouldn't include the element. So the OP example should be:

Enumerator.produce(3, &:pred).take_until(&:negative?)

Updated by Eregon (Benoit Daloze) 3 months ago

I think that makes sense, as an opposite of take_while:

  • take_while takes all elements until the block returns falsy, and does not include that element which yielded falsy.
  • take_until takes all elements until the block returns truthy, and does not include that element which yielded truthy.

If clearly documented that probably solves most of the confusion.
So +1 from me to add take_until.

I wonder if there is value in having variants that do include the element that stops, but at least so far in the linked issues there seems to be no such use-case.

Updated by zverok (Victor Shepelev) 3 months ago · Edited

About does/doesn’t include the last element, there is a related inconclusive discussion: #18136

Basically, I believe that both take_while/take_until might have pairs of use cases:

# the last element is not necessary
sequence.take_until(&:bad?)
# the last element is necessary: think `.` in a sentence
sequence.take_until(&:terminator?)

# the last element is not necessary
sequence.take_while(&:suitable?)
# the last element is necessary: think the last page on pagination
# (which doesn’t match the condition “have more pages” but is a meaningful element itself)
sequence.take_while(&:has_next_item?)

(We have a #take_while_after in our core_ext.rb, and use it regularly.)

The Enumerable#slice_after and #slice_before, for example, recognize that it might be either, but not take_while.
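
As a side note, the existing slicing methods can already express both variants when combined with first. The sketch below uses only methods that exist today. Edge cases differ from a dedicated take_until, though: if no element matches in a finite sequence the whole sequence is returned, and slice_before never emits an empty slice when the very first element matches.

```ruby
# Exclusive: stop before the first element matching the predicate.
ancestors = Enumerator.produce(File, &:superclass).slice_before(&:nil?).first
#=> [File, IO, Object, BasicObject]

# Inclusive: stop after the first element matching the predicate.
countdown = Enumerator.produce(3, &:pred).slice_after(&:zero?).first
#=> [3, 2, 1, 0]
```

Both calls terminate on an infinite sequence because first stops iterating as soon as the first slice has been emitted.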

Updated by matheusrich (Matheus Richard) 3 months ago

The kwargs proposed here could be useful:

sequence.take_until(inclusive: true, &:terminator?)
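
A minimal sketch of how such a keyword might behave, written as a free-standing method. Both take_until and its inclusive: option are hypothetical here; nothing like this exists in Ruby, and the default shown is an assumption.

```ruby
# Hypothetical: neither take_until nor its `inclusive:` option exists in Ruby.
def take_until(enum, inclusive: false)
  result = []
  enum.each do |element|
    if yield(element)
      result << element if inclusive   # keep the terminating element
      break
    end
    result << element
  end
  result
end

take_until(Enumerator.produce(3, &:pred), inclusive: true, &:zero?)
#=> [3, 2, 1, 0]
take_until(Enumerator.produce(3, &:pred), &:zero?)
#=> [3, 2, 1]
```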

Alternatively, we could always be inclusive and let people pop to remove the last element:

sequence.take_until(&:terminator?).tap(&:pop)