Bug #7696: Lazy enumerators with state can't be rewound - Ruby - Ruby Issue Tracking System

Bug #7696

closed

Lazy enumerators with state can't be rewound

Added by marcandre (Marc-Andre Lafortune) over 12 years ago. Updated almost 12 years ago.

Status:

Closed

Assignee:

matz (Yukihiro Matsumoto)

Target version:

2.1.0

ruby -v:

r38800

Backport:

[ruby-core:51430]

Description

The 4 lazy enumerators requiring internal state, i.e. {take|drop}{_while}, don't work as expected after a couple of calls to next followed by a call to rewind.

For example:

e=(1..42).lazy.take(2)
e.next # => 1
e.next # => 2
e.rewind
e.next # => 1
e.next # => StopIteration: iteration reached an end, expected 2

This is related to #7691; the current API does not give an easy way to handle state.

Either there's a dedicated callback to rewind, or data must be attached to the yielder.
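For reference, the expected behaviour can be checked with a short script. This is only a sketch and assumes a Ruby release that includes the fix for this issue; on the affected revision the last call to next raises StopIteration instead. The variable names first_pass and second_pass are illustrative.

```ruby
# Assumes a Ruby release that includes the fix for this bug.
e = (1..42).lazy.take(2)

first_pass = [e.next, e.next]   # consume both elements externally
e.rewind                        # restart external enumeration
second_pass = [e.next, e.next]  # should repeat the same two elements

p first_pass
p second_pass
```

Both passes should print the same two elements once the internal counter is reset for each enumeration run.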

Related issues 2 (0 open — 2 closed)


Updated by shugo (Shugo Maeda) over 12 years ago

marcandre (Marc-Andre Lafortune) wrote:

The 4 lazy enumerators requiring internal state, i.e. {take|drop}{_while}, don't work as expected after a couple of calls to next followed by a call to rewind.

For example:
e=(1..42).lazy.take(2)
e.next # => 1
e.next # => 2
e.rewind
e.next # => 1
e.next # => StopIteration: iteration reached an end, expected 2
This is related to #7691; the current API does not give an easy way to handle state.

#7691 is basically the same issue as #6142.

Do you have a draft API spec in mind?
It might be too late to introduce a new API for 2.0.0, though.


#2 [ruby-core:51469]

Updated by marcandre (Marc-Andre Lafortune) over 12 years ago

I had not seen #6142. Yes, it's the same.

1. Add attr_accessor to Yielder and document the fact that the yielder is unique to a single enumeration run.

def drop(n)
  n = Backports::coerce_to_int(n)
  Lazy.new(self) do |yielder, *values|
    yielder.memo ||= n
    if yielder.memo > 0
      yielder.memo -= 1
    else
      yielder.yield(*values)
    end
  end
end

2. Optional named argument prepare for Lazy.new, to be called before each:

def drop(n)
  remain = nil
  Lazy.new(self, prepare: ->{remain = n}) do |yielder, *values|
    if remain > 0
      remain -= 1
    else
      yielder.yield(*values)
    end
  end
end

3. Add a 'prepare' method to Enumerator::Lazy and encourage subclassing. prepare would be called before each:

class LazyDropper < Enumerator::Lazy
  def initialize(obj, n)
    @n = n
    super do |yielder, *values|
      if @remain > 0
        @remain -= 1
      else
        yielder.yield(*values)
      end
    end
  end

  def prepare
    @remain = @n
  end
end

MRI should use the same kind of mechanism for the enums requiring state


#3 [ruby-core:51470]

Updated by marcandre (Marc-Andre Lafortune) over 12 years ago

Oh oh.

Let me add a third example to the problems of handling state:

e = (1..6).lazy.drop(3)
e.flat_map{e}.force # => [*(1..6)]*3, should be [*(4..6)]*3

I believe only the first solution would work for this.
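The nested case can be checked the same way. This sketch assumes a Ruby release in which the state handling has been fixed; with shared state the inner enumerations would not drop anything. The expected result is the tail left after dropping three elements, repeated once per outer element.

```ruby
# Assumes a Ruby release that includes the fix for this bug.
e = (1..6).lazy.drop(3)

# Each outer element triggers a fresh inner enumeration of e,
# which must drop its own first three elements again.
nested = e.flat_map { e }.force

p nested
```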


#4 [ruby-core:51471]

Updated by marcandre (Marc-Andre Lafortune) over 12 years ago

Here's a patch for take. Does it look ok?

diff --git a/enumerator.c b/enumerator.c
index b65712f..1522a3f 100644
--- a/enumerator.c
+++ b/enumerator.c
@@ -105,7 +105,7 @@
 VALUE rb_cEnumerator;
 VALUE rb_cLazy;
 static ID id_rewind, id_each, id_new, id_initialize, id_yield, id_call, id_size;
-static ID id_eqq, id_next, id_result, id_lazy, id_receiver, id_arguments, id_method, id_force;
+static ID id_eqq, id_next, id_result, id_lazy, id_receiver, id_arguments, id_memo, id_method, id_force;
 static VALUE sym_each, sym_cycle;

 VALUE rb_eStopIteration;
@@ -1641,14 +1641,18 @@ lazy_zip(int argc, VALUE *argv, VALUE obj)
 static VALUE
 lazy_take_func(VALUE val, VALUE args, int argc, VALUE *argv)
 {
-    NODE *memo = RNODE(args);
+    long remain;
+    VALUE memo = rb_ivar_get(argv[0], id_memo);
+    if (NIL_P(memo)) {
+        memo = args;
+    }

     rb_funcall2(argv[0], id_yield, argc - 1, argv + 1);
-    if (--memo->u3.cnt == 0) {
-        memo->u3.cnt = memo->u2.argc;
+    if ((remain = NUM2LONG(memo)-1) == 0) {
         return Qundef;
     }
     else {
+        rb_ivar_set(argv[0], id_memo, LONG2NUM(remain));
         return Qnil;
     }
 }
@@ -1666,7 +1670,6 @@ lazy_take_size(VALUE lazy)
 static VALUE
 lazy_take(VALUE obj, VALUE n)
 {
-    NODE *memo;
     long len = NUM2LONG(n);
     int argc = 1;
     VALUE argv[3];
@@ -1680,9 +1683,8 @@ lazy_take(VALUE obj, VALUE n)
         argv[2] = INT2NUM(0);
         argc = 3;
     }
-    memo = NEW_MEMO(0, len, len);
     return lazy_set_method(rb_block_call(rb_cLazy, id_new, argc, argv,
-                                         lazy_take_func, (VALUE) memo),
+                                         lazy_take_func, n),
                            rb_ary_new3(1, n), lazy_take_size);
 }

@@ -1955,6 +1957,7 @@ Init_Enumerator(void)
     id_eqq = rb_intern("===");
     id_receiver = rb_intern("receiver");
     id_arguments = rb_intern("arguments");
+    id_memo = rb_intern("memo");
     id_method = rb_intern("method");
     id_force = rb_intern("force");
     sym_each = ID2SYM(id_each);
diff --git a/test/ruby/test_lazy_enumerator.rb b/test/ruby/test_lazy_enumerator.rb
index acd4843..35e92c9 100644
--- a/test/ruby/test_lazy_enumerator.rb
+++ b/test/ruby/test_lazy_enumerator.rb
@@ -243,6 +243,23 @@ class TestLazyEnumerator < Test::Unit::TestCase
     assert_equal((1..5).to_a, take5.force, bug6428)
   end

+  def test_take_nested
+    bug7696 = '[ruby-core:51470]'
+    a = Step.new(1..10)
+    take5 = a.lazy.take(5)
+    assert_equal([*(1..5)]*5, take5.flat_map{take5}.force, bug7696)
+  end
+
+  def test_take_rewound
+    bug7696 = '[ruby-core:51470]'
+    e=(1..42).lazy.take(2)
+    assert_equal 1, e.next
+    assert_equal 2, e.next
+    e.rewind
+    assert_equal 1, e.next
+    assert_equal 2, e.next
+  end
+
   def test_take_while
     a = Step.new(1..10)
     assert_equal(1, a.take_while {|i| i < 5}.first)


#5 [ruby-core:51515]

Updated by shugo (Shugo Maeda) over 12 years ago

marcandre (Marc-Andre Lafortune) wrote:

Here's a patch for take. Does it look ok?

Does it work well for zip?
I wonder how arguments are rewound.


#6 [ruby-core:51521]

Updated by marcandre (Marc-Andre Lafortune) over 12 years ago

The same idea will work for zip too; the arguments must be converted to enumerators only when yielder.memo is not set, i.e. every new yielder.

The arguments are never really rewound, but the first yielder holding them will not be reused after enum.rewind; a new yielder is given.
I haven't checked if that's the current behavior in other implementations, but if yielder is to be the state holder, that's the way it should work.

enum = (1..2).lazy.zip(1..2)
enum.next # => yielder.memo was nil, so (1..2).each is called and stored in the memo
enum.rewind # => the original yielder is discarded
enum.next # => the second yielder.memo is nil, so (1..2).each is called again and stored in the memo

I'm using the same idea in my backports gem to implement this in pure Ruby:

https://github.com/marcandre/backports/blob/master/lib/backports/2.0.0/enumerator/lazy.rb

The only other possibility that would work is to yield an extra 'state' argument when required, like:

def drop(n)
  Lazy.new(self, state: ->{{remain: n}}) do |yielder, state, *values|
    if state[:remain] > 0
      state[:remain] -= 1
    else
      yielder.yield(*values)
    end
  end
end

Maybe the :state option could be an object instead of a lambda, which would be dupped before each enumeration:

def drop(n)
  Lazy.new(self, state: {remain: n}) do |yielder, state, *values|
    if state[:remain] > 0
      state[:remain] -= 1
    else
      yielder.yield(*values)
    end
  end
end

Both solutions seem complex to me, but would definitely put the correct emphasis on how to deal with state.

So, does the patch look acceptable as far as MRI is concerned?
For the public api, should there be a public Yielder#memo and a guarantee that there is exactly one yielder object per iteration? or instead an extra state yielded when required?


#7 [ruby-core:51542]

Updated by shugo (Shugo Maeda) over 12 years ago

marcandre (Marc-Andre Lafortune) wrote:

The same idea will work for zip too; the arguments must be converted to enumerators only when yielder.memo is not set, i.e. every new yielder.

The arguments are never really rewound, but the first yielder holding them will not be reused after enum.rewind; a new yielder is given.
I haven't checked if that's the current behavior in other implementations, but if yielder is to be the state holder, that's the way it should work.

So, the following behavior is intended, right?

$ ruby1.9.3 -I ~/src/backports/lib -r backports -r backports/2.0.0/enumerable -e "a = (1..3).lazy.zip(('a'..'z').each); p a.to_a; p a.to_a"
[[1, "a"], [2, "b"], [3, "c"]]
[[1, "d"], [2, "e"], [3, "f"]]

So, does the patch look acceptable as far as MRI is concerned?

If the above behavior is intended, the patch looks acceptable.

For the public api, should there be a public Yielder#memo and a guarantee that there is exactly one yielder object per iteration? or instead an extra state yielded when required?

It might be too late to introduce a new API for 2.0.0, so how about moving it to the next minor release?


#8 [ruby-core:51551]

Updated by marcandre (Marc-Andre Lafortune) over 12 years ago

shugo (Shugo Maeda) wrote:

So, the following behavior is intended, right?

$ ruby1.9.3 -I ~/src/backports/lib -r backports -r backports/2.0.0/enumerable -e "a = (1..3).lazy.zip(('a'..'z').each); p a.to_a; p a.to_a"
[[1, "a"], [2, "b"], [3, "c"]]
[[1, "d"], [2, "e"], [3, "f"]]

That's a very good question.

It probably would be best to call to_enum instead of each. Calling next|rewind on an enumerator should really only affect other calls to next. With to_enum, we'll get the same result every time.

Similarly, we expect (1..3).zip(('a'..'z').each.tap(&:next)) to return [[1, 'a'], ..., and not [[1, 'b'], ... even though next was called on the given enumerator.

If the above behavior is intended, the patch looks acceptable.

Thanks for reviewing it. I'll commit it, changing the call to each to to_enum (unless there are objections). I'll use the same technique to fix the other 3 lazy enumerators with state.
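A short sketch of the behaviour described here, assuming a Ruby release in which Lazy#zip converts its arguments with to_enum on every enumeration run. The variable names are illustrative only.

```ruby
# Assumes a Ruby release that includes the fix for this bug.
letters = ('a'..'z').each            # an Enumerator passed as the zip argument
pairs = (1..3).lazy.zip(letters)

first = pairs.to_a                   # first enumeration run
second = pairs.to_a                  # second run should start from "a" again

# Eager zip: calling next on the argument beforehand should not shift the result.
advanced = ('a'..'z').each.tap(&:next)
eager = (1..3).zip(advanced)

p first
p second
p eager
```

All three results should be identical, since each run builds its own enumerator over the argument instead of continuing from wherever a previous run or an external next call left off.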

For the public api, should there be a public Yielder#memo and a guarantee that there is exactly one yielder object per iteration? or instead an extra state yielded when required?

It might be too late to introduce a new API for 2.0.0, so how about moving it to the next minor release?

I understand. On the other hand, the API for Lazy.new was never decided upon and really should be finalized before 2.0.0!

If we opt for the Yielder#memo way (and agree on the name), maybe we can convince mame to accept such a trivial change. In that case, the biggest "change" is the explicit guarantee of a different yielder per iteration. It's already the case (also for JRuby and rubinius), but AFAIK it's never been official.

With the modified yielding way, it would be nice to include it in Lazy.new's api in this version, especially since it is still being finalized.

At the very least, a note in the documentation about handling state would be needed for 2.0.0.


#9 [ruby-core:51555]

Updated by shugo (Shugo Maeda) over 12 years ago

Assignee set to matz (Yukihiro Matsumoto)

marcandre (Marc-Andre Lafortune) wrote:

shugo (Shugo Maeda) wrote:

So, the following behavior is intended, right?

$ ruby1.9.3 -I ~/src/backports/lib -r backports -r backports/2.0.0/enumerable -e "a = (1..3).lazy.zip(('a'..'z').each); p a.to_a; p a.to_a"
[[1, "a"], [2, "b"], [3, "c"]]
[[1, "d"], [2, "e"], [3, "f"]]

That's a very good question.

It probably would be best to call to_enum instead of each. Calling next|rewind on an enumerator should really only affect other calls to next. With to_enum, we'll get the same result every time.

Even if to_enum is called, IO instances can never be rewound, but I guess IO instances need not be rewound.

It might be too late to introduce a new API for 2.0.0, so how about moving it to the next minor release?

I understand. On the other hand, the API for Lazy.new was never decided upon and really should be finalized before 2.0.0!

If we opt for the Yielder#memo way (and agree on the name), maybe we can convince mame to accept such a trivial change. In that case, the biggest "change" is the explicit guarantee of a different yielder per iteration. It's already the case (also for JRuby and rubinius), but AFAIK it's never been official.

'The explicit guarantee of a different yielder per iteration' sounds acceptable to me, but I'm not sure whether Yielder#memo is a good name.

With the modified yielding way, it would be nice to include it in Lazy.new's api in this version, especially since it is still being finalized.

At the very least, a note in the documentation about handling state would be needed for 2.0.0.

I believe Matz should decide it.


#10 [ruby-core:51556]

Updated by shugo (Shugo Maeda) over 12 years ago

Status changed from Open to Assigned


#11

Updated by marcandre (Marc-Andre Lafortune) over 12 years ago

Status changed from Assigned to Closed
% Done changed from 0 to 100

This issue was solved with changeset r38920.
Marc-Andre, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

enumerator.c: Fix state handling for Lazy#take
[bug #7696]
test/ruby/test_lazy_enumerator.rb: test for above


#12 [ruby-core:51615]

Updated by marcandre (Marc-Andre Lafortune) over 12 years ago

Status changed from Closed to Open

Bugs in MRI are fixed, but keeping open so Matz can decide how users should handle state.


#13 [ruby-core:52361]

Updated by mame (Yusuke Endoh) over 12 years ago

Subject changed from Lazy enumerators with state can't be rewound to Lazy enumerators with state can't be rewound
Status changed from Open to Assigned
Target version changed from 2.0.0 to 2.6


#14 [ruby-core:56424]

Updated by naruse (Yui NARUSE) about 12 years ago

Target version changed from 2.6 to 2.1.0


#15 [ruby-core:56896]

Updated by marcandre (Marc-Andre Lafortune) almost 12 years ago

Status changed from Assigned to Closed

I created a new feature request #8840, so I'm closing this.
