Feature #17016
openEnumerable#accumulate
Added by parker (Parker Finch) over 4 years ago. Updated over 3 years ago.
Description
Proposal¶
UPDATE: Changed proposed method name from #scan_left
to #accumulate
.
Add an #accumulate
method to Enumerable
.
Background¶
#accumulate
is similar to #inject
, but it accumulates the partial results that are computed. As a comparison:
[1, 2, 3].inject(0, &:+) => 6
[1, 2, 3].accumulate(0, &:+) => [0, 1, 3, 6]
Notably, the accumulate
operation can be done lazily since it doesn't require processing the entire collection before computing a value.
I recently described #accumulate
, and its relationship to #inject
, more thoroughly in this blog post.
Reasoning¶
We heavily rely on the accumulate operation. We use an event-sourcing pattern, which means that we are scanning over individual "events" and building up the corresponding state. We rely on the history of states and need to do this lazily (we stream events because they cannot fit in memory). Thus the scan operation is much more applicable than the inject operation.
We suspect that there are many applications that could leverage the scan operation. This question would be more easily answered by #accumulate
. It is a natural fit for any application that needs to store the incrementally-computed values of an #inject
, and a requirement for an application that needs to use #inject
while maintaining laziness.
Implementation¶
There is a Ruby implementation of this functionality here and an implementation in C here.
Update: @nobu (Nobuyoshi Nakada) has provided an alternative implementation here.
Counterarguments¶
Introducing a new public method is committing to maintenance going forward and expands the size of the Ruby codebase -- it should not be done lightly. I think that providing the functionality here is worth the tradeoff, but I understand any hesitation to add yet more to Ruby!
Files
scan_left_example.rb (2.93 KB) scan_left_example.rb | Small script showing `scan` usage with other approaches to the same behavior | parker (Parker Finch), 07/10/2020 08:52 PM |
Updated by sawa (Tsuyoshi Sawada) over 4 years ago
What is wrong with the following?
[1, 2, 3].inject([0]){|a, e| a << a.last + e} # => [0, 1, 3, 6]
[1, 2, 3].each_with_object([0]){|e, a| a << a.last + e} # => [0, 1, 3, 6]
Updated by parker (Parker Finch) over 4 years ago
sawa (Tsuyoshi Sawada) wrote in #note-1:
What is wrong with the following?
[1, 2, 3].inject([0]){|a, e| a << a.last + e} # => [0, 1, 3, 6] [1, 2, 3].each_with_object([0]){|e, a| a << a.last + e} # => [0, 1, 3, 6]
Good question! Using #inject
or #each_with_object
cannot be done lazily. We need to pass a lazy enumerator after applying the fold-like operation. As a toy example, instead of [1, 2, 3]
we can use (1..).lazy
:
(1..).lazy.inject([0]){|a, e| a << a.last + e} # => infinite loop
(1..).lazy.each_with_object([0]){|e, a| a << a.last + e} # => infinite loop
(1..).lazy.scan_left(0, &:+) # => Lazy enumerator
Updated by Eregon (Benoit Daloze) over 4 years ago
The name scan
seems confusing at least, since it has nothing to do with String#scan
.
And *_left
has no precedent in Ruby.
It seems it's basically recording the history of each block call (+ the initial value), maybe a name expressing that would be good.
Do you have real-world usages you can show?
It seems kind of niche (seems expensive if the collection is large) and can easily be replaced by each
+ 2 variables (if not lazy at least).
Updated by parker (Parker Finch) over 4 years ago
- File scan_left_example.rb scan_left_example.rb added
The name
scan
seems confusing at least, since it has nothing to do withString#scan
.
And*_left
has no precedent in Ruby.It seems it's basically recording the history of each block call (+ the initial value), maybe a name expressing that would be good.
Ah sorry, I forgot to mention that the term "scan" is what this operation is typically called in functional programming (see here for a quick overview). So that's how the name is derived. I agree that the *_left
suffix doesn't sound very Ruby-ish, maybe just #scan
would be better. But then we really have a conflict with the name of String#scan
, thanks for pointing that out! Any thoughts on what this could be called to disambiguate that? Perhaps #accumulate
?
Do you have real-world usages you can show?
The most significant real-world usage that I've had it for is the one I briefly described above. The "scan" operation is a very good fit when processing streams of data changes. To really concretize it, our use case is in education data. One example is that each day we see if a student is present or absent. We use a scan over that data to count the number of absences that a student has -- a fold (i.e. #inject
or #reduce
) would be insufficient because the stream is lazy.
To get a small, self-contained example, I wrote up a script that uses it (and shows how other implementations of the behavior could work). That is attached, let me know if there's anything else I can do to show the usefulness here!
It seems kind of niche (seems expensive if the collection is large) and can easily be replaced by
each
+ 2 variables (if not lazy at least).
I don't think it's actually that niche! A lot of array algorithms become easier with a scan
operation, see this post for an example. And I don't think that it's particularly expensive -- it should be similar in cost to a map
.
In fact, scan-like behavior (with laziness) can be implemented with map
without too much trouble. (This is the heart of the ruby implementation.) But assigning to a variable inside of the block passed to map
doesn't feel very Ruby-ish to me:
val = 0
collection.map { |x| val = val + x }
feels less natural than something like
collection.scan(&:+)
One of my favorite aspects of Ruby is how easy it is to write in a functional programming style, and including the scan
operation would expand the number of use cases covered by functional methods.
Updated by nobu (Nobuyoshi Nakada) over 4 years ago
parker (Parker Finch) wrote in #note-2:
Good question! Using
#inject
or#each_with_object
cannot be done lazily. We need to pass a lazy enumerator after applying the fold-like operation. As a toy example, instead of[1, 2, 3]
we can use(1..).lazy
:(1..).lazy.inject([0]){|a, e| a << a.last + e} # => infinite loop (1..).lazy.each_with_object([0]){|e, a| a << a.last + e} # => infinite loop (1..).lazy.scan_left(0, &:+) # => Lazy enumerator
That sounds like Enumerator::Lazy
needs #inject
and #each_with_object
methods to me.
Updated by Eregon (Benoit Daloze) over 4 years ago
parker (Parker Finch) wrote in #note-4:
Thanks for your reply, I think it will help to discuss this issue at the dev meeting.
Maybe prefix_sum
or just prefix
or something like that would work?
Having sum
in it is kind of confusing though as it can be any "operation" not just +
-ing numbers, but it seems an official "term" for it (https://en.wikipedia.org/wiki/Prefix_sum#Scan_higher_order_function).
But assigning to a variable inside of the block passed to
map
doesn't feel very Ruby-ish to me:val = 0 collection.map { |x| val = val + x }
It's just my opinion, but I see nothing wrong with that (details: it could be val += x
).
I'd even go as far as saying each_with_object
is often less readable than using a captured variable.
I think "purely functional, not a single re-assigned variable" often introduces significant extra complexity, when Ruby is a language that embraces both functional and imperative programming.
And anyway each_with_object
is only useful if mutating some object (typically an Array or Hash).
Again, it's just my opinion :)
Updated by Eregon (Benoit Daloze) over 4 years ago
nobu (Nobuyoshi Nakada) wrote in #note-5:
That sounds like
Enumerator::Lazy
needs#inject
and#each_with_object
methods to me.
A lazy #inject
sounds useful.
Updated by nobu (Nobuyoshi Nakada) over 4 years ago
Updated by Eregon (Benoit Daloze) over 4 years ago
nobu (Nobuyoshi Nakada) wrote in #note-8:
It has a backward incompatibility on the return value.
I’m afraid if it’s acceptable.
Right, inject
might return anything, not necessarily an Array and so returning a "lazy Enumerator" instead will be unwanted in at least some cases.
Using the existing lazy map
seems most natural to me:
val = 0
p (0..).lazy.map { |x| val += x }.first(4) # => [0, 1, 3, 6]
val = 0
p [0] + (1..).lazy.map { |x| val += x }.first(3) # => [0, 1, 3, 6]
Updated by shyouhei (Shyouhei Urabe) over 4 years ago
It might be possible to let inject be lazy if we ignore backwards compatibility. But how do we partially evaluate that lazy enumerator then?
Updated by parker (Parker Finch) over 4 years ago
shyouhei (Shyouhei Urabe) wrote in #note-10:
It might be possible to let inject be lazy if we ignore backwards compatibility. But how do we partially evaluate that lazy enumerator then?
I think this is the crux of the issue. Because #inject
needs to evaluate every element in order to return a meaningful value, it can't be partially evaluated. The "scan" operation allows for partial evaluation.
Updated by parker (Parker Finch) over 4 years ago
Eregon (Benoit Daloze) wrote in #note-6:
Maybe
prefix_sum
or justprefix
or something like that would work?
Havingsum
in it is kind of confusing though as it can be any "operation" not just+
-ing numbers, but it seems an official "term" for it (https://en.wikipedia.org/wiki/Prefix_sum#Scan_higher_order_function).
I agree that including "sum" in the name is confusing. I think that "prefix_sum" is just used to describe the sum operation, and if it is generalized to other operations then it is typically called "scan".
Eregon (Benoit Daloze) wrote in #note-9:
Using the existing lazy
map
seems most natural to me:val = 0 p (0..).lazy.map { |x| val += x }.first(4) # => [0, 1, 3, 6] val = 0 p [0] + (1..).lazy.map { |x| val += x }.first(3) # => [0, 1, 3, 6]
It's definitely possible to use map
! I think it is simpler to use a scan, especially if the first element needs to be included in the lazy enumerator. For example:
(1..).lazy.scan(0, &:+)
would need to be:
val = 0
[val].chain((1..).lazy.map { |x| val += x })
because the +
method doesn't work with lazy enumerators.
Updated by Eregon (Benoit Daloze) over 4 years ago
For the name I think just scan
would be best then.
And accept the fact it's completely unrelated to String#scan
.
Updated by mame (Yusuke Endoh) over 4 years ago
Is this what you want?
irb(main):001:0> (1..).lazy.enum_for(:inject, 0).map {|a, b| a + b }.take(10).force
=> [1, 3, 6, 10, 15, 21, 28, 36, 45, 55]
Updated by parker (Parker Finch) over 4 years ago
mame (Yusuke Endoh) wrote in #note-14:
Is this what you want?
irb(main):001:0> (1..).lazy.enum_for(:inject, 0).map {|a, b| a + b }.take(10).force => [1, 3, 6, 10, 15, 21, 28, 36, 45, 55]
Oh interesting, I hadn't considered that approach! That is very close to the behavior of the scan operation, and might be a good way to implement it. (The only difference in behavior I see is that the first element (0
) is not included in the resulting enumerator with this enum_for
approach.)
Are you suggesting that we should use that approach instead of implementing a built-in scan
method? Or is the example to clarify what the behavior of the scan
method would be?
Updated by nobu (Nobuyoshi Nakada) over 4 years ago
parker (Parker Finch) wrote in #note-15:
Are you suggesting that we should use that approach instead of implementing a built-in
scan
method? Or is the example to clarify what the behavior of thescan
method would be?
As it came from the lazy inject
approach, the former.
Updated by matz (Yukihiro Matsumoto) over 4 years ago
I don't see any real-world use-case for scan_left
. Is there any?
In addition, the term scan
does not seem to describe the behavior of keeping the past sequence of accumulation.
Although I know the name from the Haskell community, I don't agree with the name.
Matz.
Updated by parker (Parker Finch) over 4 years ago
matz (Yukihiro Matsumoto) wrote in #note-17:
I don't see any real-world use-case for
scan_left
. Is there any?
I think there are real-world use cases!
Would you consider converting a history of transactions into a history of account balances a valid use-case? That can be done easily with a scan. For example, if you have transactions = [100, -200, 200]
then you can find the history of account balances with transactions.scan_left(0, &:+) # => [0, 100, -100, 100]
.
I have described our current use case of scan_left
here. We use an event-sourced architecture where we scan over a lazy stream of events, building up the state as we go. It is important for our use case that we are able to access past states in addition to the final state.
This post shows how it can make repeated subarray operations more efficient -- the example there is that if you have an array of values representing changes over time periods, then you can easily aggregate those into changes over different time periods using a scan
. This post does not use scan
, but instead has a workaround because scan
doesn't exist:
sums = [0]
(1..gains.length).each do |i|
sums[i] = sums[i - 1] + gains[i - 1]
end
could, if scan
was introduced, be replaced with:
sums = gains.scan_left(0, &:+)
Do those use cases seem sufficient?
In addition, the term
scan
does not seem to describe the behavior of keeping the past sequence of accumulation.
Although I know the name from the Haskell community, I don't agree with the name.
I agree the name is not ideal. It is easy to get it confused with the idea of a string scanner. As you mentioned, scan
is the name used by Haskell, but it is also used by Scala, Rust, and C++. So it is a widely-used term, even if it's not the best one, which makes me think that it might be good to use it.
Alternatively, what do you think about the name accumulate
, which Wolfram uses? I think it gets the idea across better than scan
.
Are there other names you think should be considered?
Updated by RubyBugs (A Nonymous) over 4 years ago
parker (Parker Finch) wrote in #note-18:
Are there other names you think should be considered?
In keeping with the Ruby-ish collection methods that end with "-ect", how about
-
reflect
-- the idea is to contrast withinject
, this "reflects" all intermediate states -
project
-- the idea is that the originalEnumerable
is "projected" in a mathematical sense into the plane defined by the stateful function that is passed in
(just my 2 cents...)
Updated by nobu (Nobuyoshi Nakada) over 4 years ago
With https://github.com/ruby/ruby/pull/3337, you can
(1..).lazy.inject(0, :+).first(10) #=> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
Updated by parker (Parker Finch) over 4 years ago
nobu (Nobuyoshi Nakada) wrote in #note-20:
With https://github.com/ruby/ruby/pull/3337, you can
(1..).lazy.inject(0, :+).first(10) #=> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
Thank you for writing that nobu! That is exactly the behavior we need.
I'm worried about the backwards compatibility of changing the behavior of #inject
on lazy enumerables. Since right now [1,2,3].lazy.inject(:+) # => 6
I think it would be breaking to change that behavior to be [1,2,3].lazy.inject(:+) # => #<Enumerator::Lazy: #<Enumerator::Lazy: [1, 2, 3]>:inject>
.
I'm also worried about #inject
having different behavior for strict and lazy enumerables. I don't think it would be good to have [1,2,3].inject(:+)
be different than [1,2,3].lazy.inject(:+).force
.
Do you think your implementation could be used for a new method, whether we call it scan
or accumulate
or project
or something else? I think it would be good to have consistent behavior between strict and lazy enumerables.
NOTE: For anyone who is newer to this thread and might have missed it -- you can find this behavior, with some examples, implemented in Ruby here.
Updated by y.annikov (Yakov Annikov) over 4 years ago
I've created a PR with the implementation of scan
and lazy.scan
methods https://github.com/ruby/ruby/pull/3358
Feel free to reject it but any comments are appreciated. I got a lot of fun writing this code and looking at Ruby source code.
Updated by nobu (Nobuyoshi Nakada) over 4 years ago
RubyBugs (A Nonymous) wrote in #note-19:
In keeping with the Ruby-ish collection methods that end with "-ect", how about
reflect
-- the idea is to contrast withinject
, this "reflects" all intermediate statesproject
-- the idea is that the originalEnumerable
is "projected" in a mathematical sense into the plane defined by the stateful function that is passed in
I'd like that “this "reflects" all intermediate states” part.
Updated by duerst (Martin Dürst) over 4 years ago
parker (Parker Finch) wrote in #note-21:
nobu (Nobuyoshi Nakada) wrote in #note-20:
(1..).lazy.inject(0, :+).first(10) #=> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
I'm worried about the backwards compatibility of changing the behavior of
#inject
on lazy enumerables. Since right now[1,2,3].lazy.inject(:+) # => 6
I think it would be breaking to change that behavior to be[1,2,3].lazy.inject(:+) # => #<Enumerator::Lazy: #<Enumerator::Lazy: [1, 2, 3]>:inject>
.I'm also worried about
#inject
having different behavior for strict and lazy enumerables. I don't think it would be good to have[1,2,3].inject(:+)
be different than[1,2,3].lazy.inject(:+).force
.
I fully agree. Lazy variants of methods should only differ with respect to laziness, not anything else.
Updated by duerst (Martin Dürst) over 4 years ago
nobu (Nobuyoshi Nakada) wrote in #note-23:
RubyBugs (A Nonymous) wrote in #note-19:
In keeping with the Ruby-ish collection methods that end with "-ect", how about
reflect
-- the idea is to contrast withinject
, this "reflects" all intermediate statesproject
-- the idea is that the originalEnumerable
is "projected" in a mathematical sense into the plane defined by the stateful function that is passed inI'd like that “this "reflects" all intermediate states” part.
I think this is way too generic. In the same vein, we could call it "return", because it returns all intermediate states.
Also, I think "project" isn't appropriate, because project is usually associated with a dimension reduction.
The most specific word in the above explanation is "intermediate". I suggest we search more along these lines. An example would be something like "inject_with_intermediates".
BTW, I also checked APL, where '' is used for what we are discussing here, and is called scan. The fact that this is included in APL shows that this is in some sense a core operation. It doesn't appear e.g. in a 1972 manual for IBM APL\360 (http://www.softwarepreservation.org/projects/apl/Manuals/APL360UsersManuals), which means it may not have been there from the start. But I found some hints in a newer document (1982, http://www.softwarepreservation.org/projects/apl/Manuals/SharpAPLManualCorrections). My guess is that the name did not come from APL (which is one of the oldest functional programming languages).
Updated by y.annikov (Yakov Annikov) over 4 years ago
y.annikov (Yakov Annikov) wrote in #note-22:
I've created a PR with the implementation of
scan
andlazy.scan
methods https://github.com/ruby/ruby/pull/3358
Feel free to reject it but any comments are appreciated. I got a lot of fun writing this code and looking at Ruby source code.
I renamed it to reflect
for a while though I like scan
more.
Here are examples of the behaviour of reflect
method that I've implemented:
# reflect
(0...0).reflect(:+) => []
(5..10).reflect(:+) => [5, 11, 18, 26, 35, 45]
(5..10).reflect { |sum, n| sum + n } => [5, 11, 18, 26, 35, 45]
(5..10).reflect(1, :*) => [1, 5, 30, 210, 1680, 15120, 151200]
(5..10).reflect(1) { |product, n| product * n } => [1, 5, 30, 210, 1680, 15120, 151200]
# lazy.reflect
(0..Float::INFINITY).lazy.reflect(:+).first(4) => [0, 1, 3, 6]
(0..Float::INFINITY).lazy.reflect(:+).first(4) => [1, 1, 2, 4]
enum = (5..10).lazy.reflect(:+) => #<Enumerator::Lazy: #<Enumerator::Lazy: 5..10>:reflect(:+)>
6.times.map { enum.next } => [5, 11, 18, 26, 35, 45]
enum = (5..10).lazy.reflect(1, :*) => #<Enumerator::Lazy: #<Enumerator::Lazy: 5..10>:reflect(1, :*)>
7.times.map { enum.next } => [1, 5, 30, 210, 1680, 15120, 151200]
UPD: Some fixes according to @duerst (Martin Dürst) comments.
Updated by RubyBugs (A Nonymous) over 4 years ago
To return to naming for a moment. Did I understand an interest may exist in naming that reflects a connection or duality with inject
?
If so, might the scan operation find a better Ruby name in interject?
The idea would be to say to new learners, with zero prior context: “the interject
method has the same form as inject, but returns the stream of intermediate values”?
Updated by duerst (Martin Dürst) over 4 years ago
RubyBugs (A Nonymous) wrote in #note-27:
To return to naming for a moment. Did I understand an interest may exist in naming that reflects a connection or duality with
inject
?
I guess that wouldn't be a bad idea.
If so, might the scan operation find a better Ruby name in interject?
The idea would be to say to new learners, with zero prior context: “the
interject
method has the same form as inject, but returns the stream of intermediate values”?
My most immediate impression of interject
is that it would add something between Array
elements (e.g. [5..8].interject(0)
#→ [5, 0, 6, 0, 7, 0, 8] or so). But the parallel to inject
is appealing. And inject
itself isn't really that much of a descriptive name, either.
We would have to check whether interject
is used in other languages, and for what.
Updated by parker (Parker Finch) over 4 years ago
I'd like to sum up where we're at with this discussion. Let me know if you disagree with my interpretation here!
- There is some support for the idea of adding this method.
- We should avoid changing the behavior of
#inject
on lazy enumerables, since it would be confusing for the lazy version of a method to have different behavior than the strict version. - There is concern around the name of this method. Possibilities that have been discussed are:
-
#scan
: This name is not very intuitive for the operation and also has other meanings. However, it is used in many other programming languages. -
#accumulate
: This would have my vote at the moment. It suggests that something is accumulated throughout the iteration of the collection. However, it is still a very generic term. -
#reflect
,#project
,#interject
: These end in-ect
as many Ruby/Smalltalk methods on collections do. However, they have little meaning related to the operation at hand. I agree that it would be good to show the duality between#inject
and this operation, but I'm concerned about using an unrelated word.
I think the next steps here are to get confirmation from @matz (Yukihiro Matsumoto) as to whether or not he thinks this would be a good method to add to Ruby (I described some use cases in #note-18) and to decide on a name. Does that make sense?
Updated by nobu (Nobuyoshi Nakada) over 4 years ago
I found a word traject, means to transport, transmit, or transpose.
It may (or may not) imply trajectory...
Updated by parker (Parker Finch) over 4 years ago
nobu (Nobuyoshi Nakada) wrote in #note-30:
I found a word traject, means to transport, transmit, or transpose.
It may (or may not) imply trajectory...
That's an interesting word that I don't know. It looks like it's archaic, so I don't think that it has much meaning anymore. That's kind of nice, since there's not a conflicting definition that people will have in their heads. However, it also doesn't have a meaning that describes what the method does.
It does have a nice symmetry with inject
though! Curious if others have thoughts?
Updated by parker (Parker Finch) over 4 years ago
I'd like to propose that we name this method #accumulate
. @matz (Yukihiro Matsumoto) do you think that is an acceptable name?
Updated by Dan0042 (Daniel DeLorme) about 4 years ago
#accumulate
is good, but since this question linked in the OP was asking for the "cumulative sum" ... what about #cumulative
?
[1,2,3].cumulative.sum #=> [1,3,6]
[1,2,3].cumulative.inject(5, :*) #=> [[5], [5,10], [5,10,30]]
[1,2,3].cumulative.select(&:odd?) #=> [[1], [1,3]]
[1,2,3].cumulative.map{ _1 * 3 } #=> [[3], [3,6], [3,6,9]]
Updated by parker (Parker Finch) about 4 years ago
Dan0042 (Daniel DeLorme) wrote in #note-33:
... what about
#cumulative
?
Oh that's interesting! I had leapt straight to verbs since that tends to be the pattern for methods that transform enumerables, but the examples of [1,2,3].cumulative.some_other_method
are compelling and make sense in a "it returns a cumulative enumerator" way.
I don't think it fits quite as well when a block is passed though; [1,2,3].cumulative(0, &:+)
doesn't read as naturally to me as the imperative #accumulate
.
I think either option is good -- although since there's a pattern of enumerable methods being named imperatively (e.g. #map
, #select
, #inject
, #drop
) I still slightly lean toward the #accumulate
option.
Updated by y.annikov (Yakov Annikov) about 4 years ago
parker (Parker Finch) wrote in #note-34:
I still slightly lean toward the
#accumulate
option.
What about #cumulate
? Does it sound natural?
Updated by parker (Parker Finch) almost 4 years ago
y.annikov (Yakov Annikov) wrote in #note-35:
What about
#cumulate
? Does it sound natural?
I'm not as familiar with the verb "cumulate", so it doesn't sound as natural to me as "accumulate". But that being said, I don't think there's a huge difference between the two words, so either one could work!
Updated by parker (Parker Finch) over 3 years ago
I like #accumulate
, and thank you @nobu (Nobuyoshi Nakada) for an implementation here!
Is there anything I can do to help us come to a decision on this one?
Updated by parker (Parker Finch) over 3 years ago
- Subject changed from Enumerable#scan_left to Enumerable#accumulate
- Description updated (diff)
Updated by mame (Yusuke Endoh) over 3 years ago
This ticket was discussed in the dev meeting, but no conclusion was reached.
BTW, Mathematica's "Accumulate" is different from this proposal. It is not general but specific to Plus.
https://reference.wolfram.com/language/ref/Accumulate.html
Accumulate[list] is effectively equivalent to FoldList[Plus,list].
Thus, FoldList is more suitable name than Accumulate in this case.
https://reference.wolfram.com/language/ref/FoldList.html
The use case explained in #18 is not so convincing that it deserves a built-in feature. To make something built-in, one should show that it is so frequently written, and/or that it is difficult to work around. We are not sure that retaining a historical context is a very frequent code pattern (@mrkn (Kenta Murata) said that cumulative sum is actually often used in mathematics, but other use case than cumulative sum was not clear). And Eregon's workaround in #9 looks very simple (maybe even simpler than scan_left).
Updated by duerst (Martin Dürst) over 3 years ago
mame (Yusuke Endoh) wrote in #note-39:
This ticket was discussed in the dev meeting, but no conclusion was reached.
BTW, Mathematica's "Accumulate" is different from this proposal. It is not general but specific to Plus.
https://reference.wolfram.com/language/ref/Accumulate.html
Accumulate[list] is effectively equivalent to FoldList[Plus,list].
That's true.
Thus, FoldList is more suitable name than Accumulate in this case.
I don't thinks FoldList would be appropriate in Ruby. I think if we use that name, it would be fold_list
. But we are not dealing with lists, we are dealing with Arrays and Enumerables. But we wouldn't use fold_array
, and reducing this to fold
only wouldn't be appropriate, because fold is a much more general operation (in general, it's closer to Ruby's inject
).
The use case explained in #18 is not so convincing that it deserves a built-in feature. To make something built-in, one should show that it is so frequently written, and/or that it is difficult to work around. We are not sure that retaining a historical context is a very frequent code pattern (@mrkn (Kenta Murata) said that cumulative sum is actually often used in mathematics, but other use case than cumulative sum was not clear).
Even if it's only cumulative sum, that's still quite useful. And if the cumulative sum is the main use case, accumulate
may not be such a bad name. Even if this were e.g. for products, accumulate
would still work in some sense. Another name might be trace
, i.e. in the sense of tracing the evolution of a calculation.
As mentioned above, that's very relevant in a lazy context. A good example might be a bank account with a stream of transfers.
And Eregon's workaround in #9 looks very simple (maybe even simpler than scan_left).
I agree that this doesn't feel very Ruby-ish. For me it's not so much the assignment in the block, but the initialization of a variable outside the block.
Updated by RubyBugs (A Nonymous) over 3 years ago
Thanks everyone continuing to discuss whether to add this method to the Ruby lazy Enumerable!
In case it is helpful, please permit me to clarify that this method (and the functional programming pattern it represents) is of practical, rather than theoretical benefit.
This method underlies the system presented in the following conference talk ETL and Event Sourcing, which daily rebuilds all system state by re-processing enumerations of the history of data extracted from external systems:
- Slides: https://www.slideshare.net/ms-tg/etl-and-event-sourcing
- Video (part 1): https://www.dropbox.com/s/vibkr2edqmtid9n/047_etl_and_event_sourcing_marc_siegel_panorama_education_part_1.mp4?dl=0
- Video (part 2): https://www.dropbox.com/s/o6bwxymrkmbepgr/048_etl_and_event_sourcing_marc_siegel_panorama_education_part_2.mp4?dl=0
The implementation of this method we use is published here:
- Rubygems: https://rubygems.org/gems/scan_left
- Github: https://github.com/panorama-ed/scan_left/
A blog post introducing the gem and discussing its usage: https://medium.com/building-panorama-education/scan-left-a-lazy-incremental-alternative-to-inject-f6e946f74c00
Sincere apologies if this additional context is redundant or unnecessary. My intent in presenting this context is just to clarify that this is practical Ruby code extracted from a production system, rather than a purely theoretical matter of interest.
Thanks again!