Feature #16816
openPrematurely terminated Enumerator should stay terminated
Description
When iterating over an Enumerator, there are three different possible results of calling next
:
- The next item is returned and a cursor is advanced
- There's no next item and the Enumerator will forever raise
StopIteration
- There's an error getting the next item which is raised out of
next
This third case has some unexpected behavior that I discovered while working on https://github.com/jruby/jruby/issues/6157
It seems that when an Enumerator fails prematurely with an exception, any subsequent call to #next will cause it to restart.
This can be seen in a simple script I used to write a ruby/spec in https://github.com/jruby/jruby/pull/6190
Enumerator.new {
2.times {|i| raise i.to_s }
}.tap {|f|
p 2.times.map { f.next rescue $!.message }
}
The output from this is [0, 0]
. After the iteration fails, the second next
call causes it to restart and it fails again.
Contrast this to the behavior of item 3 above; when an Enumerator finishes iterating without error, it remains "finished" forever and can't be restarted.
I believe the restarting behavior is at best undocumented behavior and at worst incorrect and unspected. Take this example:
e = Enumerator.new { |y|
c = new_database_cursor
5.times { y.yield c.next_result }
}
If next_result
here raises an error, a subsequent call to next
on this enumerator will cause it to restart, re-acquire the cursor, and begin again.
As another example I ask a question: how do you indicate that an Enumerator failed due to an error, and keep it failed so it doesn't restart again?
Updated by headius (Charles Nutter) over 4 years ago
A simpler way to say why I believe this is a bug...
If I have an Enumerator with custom logic, and it is constructed and used once, I expect the iteration block will be entered exactly once and exited exactly once.
Updated by shevegen (Robert A. Heiler) over 4 years ago
( Just a brief typo-correction, I think you meant the link https://github.com/jruby/jruby/issues/6157 and not https://github.com/jruby/jruby/issue/6157 )
Updated by headius (Charles Nutter) over 4 years ago
- Description updated (diff)
Thank you, I've fixed the link.
Updated by ko1 (Koichi Sasada) over 4 years ago
Do you propose that the following your example:
Enumerator.new {
2.times {|i| raise i.to_s }
}.tap {|f|
p 2.times.map { f.next rescue $!.message }
}
should outputs ["0", "iteration reached an end"]
?
Updated by mame (Yusuke Endoh) over 4 years ago
A simpler code for dev-meeting:
g = Enumerator.new { raise "error" }
p((g.next rescue $!)) #=> "error" (as expected)
p((g.next rescue $!)) #=> actual: "error", expected: StopIteration
Updated by Eregon (Benoit Daloze) over 4 years ago
I agree this seems a bug, the expected
of the example by @mame (Yusuke Endoh) makes sense.
For @ko1's example, yes, because the exception will reach the end of the Enumerator (and Fiber), so there is nothing to Fiber.yield/Enumerator#next after.
Updated by jbeschi (jacopo beschi) about 4 years ago
I'd like to try working on this fix: is it possible?
Updated by jeremyevans0 (Jeremy Evans) over 3 years ago
- Tracker changed from Bug to Feature
- ruby -v deleted (
all) - Backport deleted (
2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN)
I prepared a commit to fix this: https://github.com/jeremyevans/ruby/commit/851534bbffd87c79bb63e8df36d6a47cc821aef0
Unfortunately, it breaks a CSV test:
1) Failure:
TestCSVParseInvalid#test_ignore_invalid_line [/home/runner/work/ruby/ruby/src/test/csv/parse/test_invalid.rb:34]:
<false> expected but was
<true>.
This failure shows the general problem with the idea. Basically, when enumerating has side effects, such as reading from a file, after an error is raised, you may want to continue enumerating after the error. Because the file state has changed, this is not equivalent to restarting from the beginning, it's equivalent to picking up where you left off.
So this isn't a change we would want to make by default. We would only want it for stateless enumerators, and we don't currently differentiate between enumerators that have external state and those that do not.
I think the current behavior makes sense for enumerators having external state. So I conclude this is not a bug, but a feature request, and one that could only be made if Ruby decided to differentiate the two types of enumerators.