Project

General

Profile

Actions

Feature #16816

open

Prematurely terminated Enumerator should stay terminated

Added by headius (Charles Nutter) about 1 year ago. Updated about 1 month ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:98059]

Description

When iterating over an Enumerator, there are three different possible results of calling next:

  1. The next item is returned and a cursor is advanced
  2. There's no next item and the Enumerator will forever raise StopIteration
  3. There's an error getting the next item which is raised out of next

This third case has some unexpected behavior that I discovered while working on https://github.com/jruby/jruby/issues/6157

It seems that when an Enumerator fails prematurely with an exception, any subsequent call to #next will cause it to restart.

This can be seen in a simple script I used to write a ruby/spec in https://github.com/jruby/jruby/pull/6190

Enumerator.new {
  2.times {|i| raise i.to_s }
}.tap {|f|
  p 2.times.map { f.next rescue $!.message }
}

The output from this is [0, 0]. After the iteration fails, the second next call causes it to restart and it fails again.

Contrast this to the behavior of item 3 above; when an Enumerator finishes iterating without error, it remains "finished" forever and can't be restarted.

I believe the restarting behavior is at best undocumented behavior and at worst incorrect and unspected. Take this example:

e = Enumerator.new { |y|
  c = new_database_cursor
  5.times { y.yield c.next_result }
}

If next_result here raises an error, a subsequent call to next on this enumerator will cause it to restart, re-acquire the cursor, and begin again.

As another example I ask a question: how do you indicate that an Enumerator failed due to an error, and keep it failed so it doesn't restart again?

Updated by headius (Charles Nutter) about 1 year ago

A simpler way to say why I believe this is a bug...

If I have an Enumerator with custom logic, and it is constructed and used once, I expect the iteration block will be entered exactly once and exited exactly once.

Updated by headius (Charles Nutter) about 1 year ago

  • Description updated (diff)

Thank you, I've fixed the link.

Updated by ko1 (Koichi Sasada) about 1 year ago

Do you propose that the following your example:

Enumerator.new {
  2.times {|i| raise i.to_s }
}.tap {|f|
  p 2.times.map { f.next rescue $!.message }
}

should outputs ["0", "iteration reached an end"] ?

Updated by mame (Yusuke Endoh) about 1 year ago

A simpler code for dev-meeting:

g = Enumerator.new { raise "error" }

p((g.next rescue $!)) #=> "error" (as expected)
p((g.next rescue $!)) #=> actual: "error", expected: StopIteration

Updated by Eregon (Benoit Daloze) about 1 year ago

I agree this seems a bug, the expected of the example by mame (Yusuke Endoh) makes sense.

For ko1 (Koichi Sasada)'s example, yes, because the exception will reach the end of the Enumerator (and Fiber), so there is nothing to Fiber.yield/Enumerator#next after.

Updated by jbeschi (jacopo beschi) 5 months ago

I'd like to try working on this fix: is it possible?

Updated by jeremyevans0 (Jeremy Evans) about 1 month ago

  • Backport deleted (2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN)
  • ruby -v deleted (all)
  • Tracker changed from Bug to Feature

I prepared a commit to fix this: https://github.com/jeremyevans/ruby/commit/851534bbffd87c79bb63e8df36d6a47cc821aef0

Unfortunately, it breaks a CSV test:

  1) Failure:
TestCSVParseInvalid#test_ignore_invalid_line [/home/runner/work/ruby/ruby/src/test/csv/parse/test_invalid.rb:34]:
<false> expected but was
<true>.

This failure shows the general problem with the idea. Basically, when enumerating has side effects, such as reading from a file, after an error is raised, you may want to continue enumerating after the error. Because the file state has changed, this is not equivalent to restarting from the beginning, it's equivalent to picking up where you left off.

So this isn't a change we would want to make by default. We would only want it for stateless enumerators, and we don't currently differentiate between enumerators that have external state and those that do not.

I think the current behavior makes sense for enumerators having external state. So I conclude this is not a bug, but a feature request, and one that could only be made if Ruby decided to differentiate the two types of enumerators.

Actions

Also available in: Atom PDF