Project

General

Profile

Actions

Feature #17016

open

Enumerable#accumulate

Added by parker (Parker Finch) over 3 years ago. Updated almost 3 years ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:99078]

Description

Proposal

UPDATE: Changed proposed method name from #scan_left to #accumulate.

Add an #accumulate method to Enumerable.

Background

#accumulate is similar to #inject, but it accumulates the partial results that are computed. As a comparison:

[1, 2, 3].inject(0, &:+) => 6
[1, 2, 3].accumulate(0, &:+) => [0, 1, 3, 6]

Notably, the accumulate operation can be done lazily since it doesn't require processing the entire collection before computing a value.

I recently described #accumulate, and its relationship to #inject, more thoroughly in this blog post.

Reasoning

We heavily rely on the accumulate operation. We use an event-sourcing pattern, which means that we are scanning over individual "events" and building up the corresponding state. We rely on the history of states and need to do this lazily (we stream events because they cannot fit in memory). Thus the scan operation is much more applicable than the inject operation.

We suspect that there are many applications that could leverage the scan operation. This question would be more easily answered by #accumulate. It is a natural fit for any application that needs to store the incrementally-computed values of an #inject, and a requirement for an application that needs to use #inject while maintaining laziness.

Implementation

There is a Ruby implementation of this functionality here and an implementation in C here.

Update: @nobu (Nobuyoshi Nakada) has provided an alternative implementation here.

Counterarguments

Introducing a new public method is committing to maintenance going forward and expands the size of the Ruby codebase -- it should not be done lightly. I think that providing the functionality here is worth the tradeoff, but I understand any hesitation to add yet more to Ruby!


Files

scan_left_example.rb (2.93 KB) scan_left_example.rb Small script showing `scan` usage with other approaches to the same behavior parker (Parker Finch), 07/10/2020 08:52 PM

Updated by sawa (Tsuyoshi Sawada) over 3 years ago

What is wrong with the following?

[1, 2, 3].inject([0]){|a, e| a << a.last + e} # => [0, 1, 3, 6]
[1, 2, 3].each_with_object([0]){|e, a| a << a.last + e} # => [0, 1, 3, 6]

Updated by parker (Parker Finch) over 3 years ago

sawa (Tsuyoshi Sawada) wrote in #note-1:

What is wrong with the following?

[1, 2, 3].inject([0]){|a, e| a << a.last + e} # => [0, 1, 3, 6]
[1, 2, 3].each_with_object([0]){|e, a| a << a.last + e} # => [0, 1, 3, 6]

Good question! Using #inject or #each_with_object cannot be done lazily. We need to pass a lazy enumerator after applying the fold-like operation. As a toy example, instead of [1, 2, 3] we can use (1..).lazy:

(1..).lazy.inject([0]){|a, e| a << a.last + e} # => infinite loop
(1..).lazy.each_with_object([0]){|e, a| a << a.last + e} # => infinite loop
(1..).lazy.scan_left(0, &:+) # => Lazy enumerator

Updated by Eregon (Benoit Daloze) over 3 years ago

The name scan seems confusing at least, since it has nothing to do with String#scan.
And *_left has no precedent in Ruby.

It seems it's basically recording the history of each block call (+ the initial value), maybe a name expressing that would be good.

Do you have real-world usages you can show?
It seems kind of niche (seems expensive if the collection is large) and can easily be replaced by each + 2 variables (if not lazy at least).

Updated by parker (Parker Finch) over 3 years ago

The name scan seems confusing at least, since it has nothing to do with String#scan.
And *_left has no precedent in Ruby.

It seems it's basically recording the history of each block call (+ the initial value), maybe a name expressing that would be good.

Ah sorry, I forgot to mention that the term "scan" is what this operation is typically called in functional programming (see here for a quick overview). So that's how the name is derived. I agree that the *_left suffix doesn't sound very Ruby-ish, maybe just #scan would be better. But then we really have a conflict with the name of String#scan, thanks for pointing that out! Any thoughts on what this could be called to disambiguate that? Perhaps #accumulate?

Do you have real-world usages you can show?

The most significant real-world usage that I've had it for is the one I briefly described above. The "scan" operation is a very good fit when processing streams of data changes. To really concretize it, our use case is in education data. One example is that each day we see if a student is present or absent. We use a scan over that data to count the number of absences that a student has -- a fold (i.e. #inject or #reduce) would be insufficient because the stream is lazy.

To get a small, self-contained example, I wrote up a script that uses it (and shows how other implementations of the behavior could work). That is attached, let me know if there's anything else I can do to show the usefulness here!

It seems kind of niche (seems expensive if the collection is large) and can easily be replaced by each + 2 variables (if not lazy at least).

I don't think it's actually that niche! A lot of array algorithms become easier with a scan operation, see this post for an example. And I don't think that it's particularly expensive -- it should be similar in cost to a map.

In fact, scan-like behavior (with laziness) can be implemented with map without too much trouble. (This is the heart of the ruby implementation.) But assigning to a variable inside of the block passed to map doesn't feel very Ruby-ish to me:

val = 0
collection.map { |x| val = val + x }

feels less natural than something like

collection.scan(&:+)

One of my favorite aspects of Ruby is how easy it is to write in a functional programming style, and including the scan operation would expand the number of use cases covered by functional methods.

Updated by nobu (Nobuyoshi Nakada) over 3 years ago

parker (Parker Finch) wrote in #note-2:

Good question! Using #inject or #each_with_object cannot be done lazily. We need to pass a lazy enumerator after applying the fold-like operation. As a toy example, instead of [1, 2, 3] we can use (1..).lazy:

(1..).lazy.inject([0]){|a, e| a << a.last + e} # => infinite loop
(1..).lazy.each_with_object([0]){|e, a| a << a.last + e} # => infinite loop
(1..).lazy.scan_left(0, &:+) # => Lazy enumerator

That sounds like Enumerator::Lazy needs #inject and #each_with_object methods to me.

Updated by Eregon (Benoit Daloze) over 3 years ago

parker (Parker Finch) wrote in #note-4:

Thanks for your reply, I think it will help to discuss this issue at the dev meeting.

Maybe prefix_sum or just prefix or something like that would work?
Having sum in it is kind of confusing though as it can be any "operation" not just +-ing numbers, but it seems an official "term" for it (https://en.wikipedia.org/wiki/Prefix_sum#Scan_higher_order_function).

But assigning to a variable inside of the block passed to map doesn't feel very Ruby-ish to me:

val = 0
collection.map { |x| val = val + x }

It's just my opinion, but I see nothing wrong with that (details: it could be val += x).
I'd even go as far as saying each_with_object is often less readable than using a captured variable.
I think "purely functional, not a single re-assigned variable" often introduces significant extra complexity, when Ruby is a language that embraces both functional and imperative programming.
And anyway each_with_object is only useful if mutating some object (typically an Array or Hash).
Again, it's just my opinion :)

Updated by Eregon (Benoit Daloze) over 3 years ago

nobu (Nobuyoshi Nakada) wrote in #note-5:

That sounds like Enumerator::Lazy needs #inject and #each_with_object methods to me.

A lazy #inject sounds useful.

Updated by nobu (Nobuyoshi Nakada) over 3 years ago

Eregon (Benoit Daloze) wrote in #note-7:

nobu (Nobuyoshi Nakada) wrote in #note-5:

That sounds like Enumerator::Lazy needs #inject and #each_with_object methods to me.

A lazy #inject sounds useful.

It has a backward incompatibility on the return value.
I’m afraid if it’s acceptable.

Updated by Eregon (Benoit Daloze) over 3 years ago

nobu (Nobuyoshi Nakada) wrote in #note-8:

It has a backward incompatibility on the return value.
I’m afraid if it’s acceptable.

Right, inject might return anything, not necessarily an Array and so returning a "lazy Enumerator" instead will be unwanted in at least some cases.

Using the existing lazy map seems most natural to me:

val = 0
p (0..).lazy.map { |x| val += x }.first(4) # => [0, 1, 3, 6]

val = 0
p [0] + (1..).lazy.map { |x| val += x }.first(3) # => [0, 1, 3, 6]

Updated by shyouhei (Shyouhei Urabe) over 3 years ago

It might be possible to let inject be lazy if we ignore backwards compatibility. But how do we partially evaluate that lazy enumerator then?

Updated by parker (Parker Finch) over 3 years ago

shyouhei (Shyouhei Urabe) wrote in #note-10:

It might be possible to let inject be lazy if we ignore backwards compatibility. But how do we partially evaluate that lazy enumerator then?

I think this is the crux of the issue. Because #inject needs to evaluate every element in order to return a meaningful value, it can't be partially evaluated. The "scan" operation allows for partial evaluation.

Updated by parker (Parker Finch) over 3 years ago

Eregon (Benoit Daloze) wrote in #note-6:

Maybe prefix_sum or just prefix or something like that would work?
Having sum in it is kind of confusing though as it can be any "operation" not just +-ing numbers, but it seems an official "term" for it (https://en.wikipedia.org/wiki/Prefix_sum#Scan_higher_order_function).

I agree that including "sum" in the name is confusing. I think that "prefix_sum" is just used to describe the sum operation, and if it is generalized to other operations then it is typically called "scan".


Eregon (Benoit Daloze) wrote in #note-9:

Using the existing lazy map seems most natural to me:

val = 0
p (0..).lazy.map { |x| val += x }.first(4) # => [0, 1, 3, 6]

val = 0
p [0] + (1..).lazy.map { |x| val += x }.first(3) # => [0, 1, 3, 6]

It's definitely possible to use map! I think it is simpler to use a scan, especially if the first element needs to be included in the lazy enumerator. For example:

(1..).lazy.scan(0, &:+)

would need to be:

val = 0
[val].chain((1..).lazy.map { |x| val += x })

because the + method doesn't work with lazy enumerators.

Updated by Eregon (Benoit Daloze) over 3 years ago

For the name I think just scan would be best then.
And accept the fact it's completely unrelated to String#scan.

Updated by mame (Yusuke Endoh) over 3 years ago

Is this what you want?

irb(main):001:0> (1..).lazy.enum_for(:inject, 0).map {|a, b| a + b }.take(10).force
=> [1, 3, 6, 10, 15, 21, 28, 36, 45, 55]

Updated by parker (Parker Finch) over 3 years ago

mame (Yusuke Endoh) wrote in #note-14:

Is this what you want?

irb(main):001:0> (1..).lazy.enum_for(:inject, 0).map {|a, b| a + b }.take(10).force
=> [1, 3, 6, 10, 15, 21, 28, 36, 45, 55]

Oh interesting, I hadn't considered that approach! That is very close to the behavior of the scan operation, and might be a good way to implement it. (The only difference in behavior I see is that the first element (0) is not included in the resulting enumerator with this enum_for approach.)

Are you suggesting that we should use that approach instead of implementing a built-in scan method? Or is the example to clarify what the behavior of the scan method would be?

Updated by nobu (Nobuyoshi Nakada) over 3 years ago

parker (Parker Finch) wrote in #note-15:

Are you suggesting that we should use that approach instead of implementing a built-in scan method? Or is the example to clarify what the behavior of the scan method would be?

As it came from the lazy inject approach, the former.

Updated by matz (Yukihiro Matsumoto) over 3 years ago

I don't see any real-world use-case for scan_left. Is there any?
In addition, the term scan does not seem to describe the behavior of keeping the past sequence of accumulation.
Although I know the name from the Haskell community, I don't agree with the name.

Matz.

Updated by parker (Parker Finch) over 3 years ago

matz (Yukihiro Matsumoto) wrote in #note-17:

I don't see any real-world use-case for scan_left. Is there any?

I think there are real-world use cases!

Would you consider converting a history of transactions into a history of account balances a valid use-case? That can be done easily with a scan. For example, if you have transactions = [100, -200, 200] then you can find the history of account balances with transactions.scan_left(0, &:+) # => [0, 100, -100, 100].

I have described our current use case of scan_left here. We use an event-sourced architecture where we scan over a lazy stream of events, building up the state as we go. It is important for our use case that we are able to access past states in addition to the final state.

This post shows how it can make repeated subarray operations more efficient -- the example there is that if you have an array of values representing changes over time periods, then you can easily aggregate those into changes over different time periods using a scan. This post does not use scan, but instead has a workaround because scan doesn't exist:

sums = [0]
(1..gains.length).each do |i|
  sums[i] = sums[i - 1] + gains[i - 1]
end

could, if scan was introduced, be replaced with:

sums = gains.scan_left(0, &:+)

Do those use cases seem sufficient?


In addition, the term scan does not seem to describe the behavior of keeping the past sequence of accumulation.
Although I know the name from the Haskell community, I don't agree with the name.

I agree the name is not ideal. It is easy to get it confused with the idea of a string scanner. As you mentioned, scan is the name used by Haskell, but it is also used by Scala, Rust, and C++. So it is a widely-used term, even if it's not the best one, which makes me think that it might be good to use it.

Alternatively, what do you think about the name accumulate, which Wolfram uses? I think it gets the idea across better than scan.

Are there other names you think should be considered?

Updated by RubyBugs (A Nonymous) over 3 years ago

parker (Parker Finch) wrote in #note-18:

Are there other names you think should be considered?

In keeping with the Ruby-ish collection methods that end with "-ect", how about

  • reflect -- the idea is to contrast with inject, this "reflects" all intermediate states
  • project -- the idea is that the original Enumerable is "projected" in a mathematical sense into the plane defined by the stateful function that is passed in

(just my 2 cents...)

Updated by nobu (Nobuyoshi Nakada) over 3 years ago

With https://github.com/ruby/ruby/pull/3337, you can

(1..).lazy.inject(0, :+).first(10) #=> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]

Updated by parker (Parker Finch) over 3 years ago

nobu (Nobuyoshi Nakada) wrote in #note-20:

With https://github.com/ruby/ruby/pull/3337, you can

(1..).lazy.inject(0, :+).first(10) #=> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]

Thank you for writing that nobu! That is exactly the behavior we need.

I'm worried about the backwards compatibility of changing the behavior of #inject on lazy enumerables. Since right now [1,2,3].lazy.inject(:+) # => 6 I think it would be breaking to change that behavior to be [1,2,3].lazy.inject(:+) # => #<Enumerator::Lazy: #<Enumerator::Lazy: [1, 2, 3]>:inject>.

I'm also worried about #inject having different behavior for strict and lazy enumerables. I don't think it would be good to have [1,2,3].inject(:+) be different than [1,2,3].lazy.inject(:+).force.

Do you think your implementation could be used for a new method, whether we call it scan or accumulate or project or something else? I think it would be good to have consistent behavior between strict and lazy enumerables.


NOTE: For anyone who is newer to this thread and might have missed it -- you can find this behavior, with some examples, implemented in Ruby here.

Updated by y.annikov (Yakov Annikov) over 3 years ago

I've created a PR with the implementation of scan and lazy.scan methods https://github.com/ruby/ruby/pull/3358
Feel free to reject it but any comments are appreciated. I got a lot of fun writing this code and looking at Ruby source code.

Updated by nobu (Nobuyoshi Nakada) over 3 years ago

RubyBugs (A Nonymous) wrote in #note-19:

In keeping with the Ruby-ish collection methods that end with "-ect", how about

  • reflect -- the idea is to contrast with inject, this "reflects" all intermediate states
  • project -- the idea is that the original Enumerable is "projected" in a mathematical sense into the plane defined by the stateful function that is passed in

I'd like that “this "reflects" all intermediate states” part.

Updated by duerst (Martin Dürst) over 3 years ago

parker (Parker Finch) wrote in #note-21:

nobu (Nobuyoshi Nakada) wrote in #note-20:

(1..).lazy.inject(0, :+).first(10) #=> [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]

I'm worried about the backwards compatibility of changing the behavior of #inject on lazy enumerables. Since right now [1,2,3].lazy.inject(:+) # => 6 I think it would be breaking to change that behavior to be [1,2,3].lazy.inject(:+) # => #<Enumerator::Lazy: #<Enumerator::Lazy: [1, 2, 3]>:inject>.

I'm also worried about #inject having different behavior for strict and lazy enumerables. I don't think it would be good to have [1,2,3].inject(:+) be different than [1,2,3].lazy.inject(:+).force.

I fully agree. Lazy variants of methods should only differ with respect to laziness, not anything else.

Updated by duerst (Martin Dürst) over 3 years ago

nobu (Nobuyoshi Nakada) wrote in #note-23:

RubyBugs (A Nonymous) wrote in #note-19:

In keeping with the Ruby-ish collection methods that end with "-ect", how about

  • reflect -- the idea is to contrast with inject, this "reflects" all intermediate states
  • project -- the idea is that the original Enumerable is "projected" in a mathematical sense into the plane defined by the stateful function that is passed in

I'd like that “this "reflects" all intermediate states” part.

I think this is way too generic. In the same vein, we could call it "return", because it returns all intermediate states.
Also, I think "project" isn't appropriate, because project is usually associated with a dimension reduction.

The most specific word in the above explanation is "intermediate". I suggest we search more along these lines. An example would be something like "inject_with_intermediates".

BTW, I also checked APL, where '' is used for what we are discussing here, and is called scan. The fact that this is included in APL shows that this is in some sense a core operation. It doesn't appear e.g. in a 1972 manual for IBM APL\360 (http://www.softwarepreservation.org/projects/apl/Manuals/APL360UsersManuals), which means it may not have been there from the start. But I found some hints in a newer document (1982, http://www.softwarepreservation.org/projects/apl/Manuals/SharpAPLManualCorrections). My guess is that the name did not come from APL (which is one of the oldest functional programming languages).

Updated by y.annikov (Yakov Annikov) over 3 years ago

y.annikov (Yakov Annikov) wrote in #note-22:

I've created a PR with the implementation of scan and lazy.scan methods https://github.com/ruby/ruby/pull/3358
Feel free to reject it but any comments are appreciated. I got a lot of fun writing this code and looking at Ruby source code.

I renamed it to reflect for a while though I like scan more.
Here are examples of the behaviour of reflect method that I've implemented:

# reflect
(0...0).reflect(:+) => []
(5..10).reflect(:+) => [5, 11, 18, 26, 35, 45]
(5..10).reflect { |sum, n| sum + n } => [5, 11, 18, 26, 35, 45]
(5..10).reflect(1, :*) => [1, 5, 30, 210, 1680, 15120, 151200]
(5..10).reflect(1) { |product, n| product * n } => [1, 5, 30, 210, 1680, 15120, 151200]
# lazy.reflect
(0..Float::INFINITY).lazy.reflect(:+).first(4) => [0, 1, 3, 6]
(0..Float::INFINITY).lazy.reflect(:+).first(4) => [1, 1, 2, 4]
enum = (5..10).lazy.reflect(:+) => #<Enumerator::Lazy: #<Enumerator::Lazy: 5..10>:reflect(:+)>
6.times.map { enum.next } => [5, 11, 18, 26, 35, 45]
enum = (5..10).lazy.reflect(1, :*) => #<Enumerator::Lazy: #<Enumerator::Lazy: 5..10>:reflect(1, :*)>
7.times.map { enum.next } => [1, 5, 30, 210, 1680, 15120, 151200]

UPD: Some fixes according to @duerst (Martin Dürst) comments.

Updated by RubyBugs (A Nonymous) over 3 years ago

To return to naming for a moment. Did I understand an interest may exist in naming that reflects a connection or duality with inject?

If so, might the scan operation find a better Ruby name in interject?

The idea would be to say to new learners, with zero prior context: “the interject method has the same form as inject, but returns the stream of intermediate values”?

Updated by duerst (Martin Dürst) over 3 years ago

RubyBugs (A Nonymous) wrote in #note-27:

To return to naming for a moment. Did I understand an interest may exist in naming that reflects a connection or duality with inject?

I guess that wouldn't be a bad idea.

If so, might the scan operation find a better Ruby name in interject?

The idea would be to say to new learners, with zero prior context: “the interject method has the same form as inject, but returns the stream of intermediate values”?

My most immediate impression of interject is that it would add something between Array elements (e.g. [5..8].interject(0) #→ [5, 0, 6, 0, 7, 0, 8] or so). But the parallel to inject is appealing. And inject itself isn't really that much of a descriptive name, either.

We would have to check whether interject is used in other languages, and for what.

Updated by parker (Parker Finch) over 3 years ago

I'd like to sum up where we're at with this discussion. Let me know if you disagree with my interpretation here!

  1. There is some support for the idea of adding this method.
  2. We should avoid changing the behavior of #inject on lazy enumerables, since it would be confusing for the lazy version of a method to have different behavior than the strict version.
  3. There is concern around the name of this method. Possibilities that have been discussed are:
  • #scan: This name is not very intuitive for the operation and also has other meanings. However, it is used in many other programming languages.
  • #accumulate: This would have my vote at the moment. It suggests that something is accumulated throughout the iteration of the collection. However, it is still a very generic term.
  • #reflect, #project, #interject: These end in -ect as many Ruby/Smalltalk methods on collections do. However, they have little meaning related to the operation at hand. I agree that it would be good to show the duality between #inject and this operation, but I'm concerned about using an unrelated word.

I think the next steps here are to get confirmation from @matz (Yukihiro Matsumoto) as to whether or not he thinks this would be a good method to add to Ruby (I described some use cases in #note-18) and to decide on a name. Does that make sense?

Updated by nobu (Nobuyoshi Nakada) over 3 years ago

I found a word traject, means to transport, transmit, or transpose.
It may (or may not) imply trajectory...

Updated by parker (Parker Finch) over 3 years ago

nobu (Nobuyoshi Nakada) wrote in #note-30:

I found a word traject, means to transport, transmit, or transpose.
It may (or may not) imply trajectory...

That's an interesting word that I don't know. It looks like it's archaic, so I don't think that it has much meaning anymore. That's kind of nice, since there's not a conflicting definition that people will have in their heads. However, it also doesn't have a meaning that describes what the method does.

It does have a nice symmetry with inject though! Curious if others have thoughts?

Updated by parker (Parker Finch) over 3 years ago

I'd like to propose that we name this method #accumulate. @matz (Yukihiro Matsumoto) do you think that is an acceptable name?

Updated by Dan0042 (Daniel DeLorme) over 3 years ago

#accumulate is good, but since this question linked in the OP was asking for the "cumulative sum" ... what about #cumulative ?

[1,2,3].cumulative.sum            #=> [1,3,6]
[1,2,3].cumulative.inject(5, :*)  #=> [[5], [5,10], [5,10,30]]
[1,2,3].cumulative.select(&:odd?) #=> [[1], [1,3]]
[1,2,3].cumulative.map{ _1 * 3 }  #=> [[3], [3,6], [3,6,9]]

Updated by parker (Parker Finch) over 3 years ago

Dan0042 (Daniel DeLorme) wrote in #note-33:

... what about #cumulative ?

Oh that's interesting! I had leapt straight to verbs since that tends to be the pattern for methods that transform enumerables, but the examples of [1,2,3].cumulative.some_other_method are compelling and make sense in a "it returns a cumulative enumerator" way.

I don't think it fits quite as well when a block is passed though; [1,2,3].cumulative(0, &:+) doesn't read as naturally to me as the imperative #accumulate.

I think either option is good -- although since there's a pattern of enumerable methods being named imperatively (e.g. #map, #select, #inject, #drop) I still slightly lean toward the #accumulate option.

Updated by y.annikov (Yakov Annikov) over 3 years ago

parker (Parker Finch) wrote in #note-34:

I still slightly lean toward the #accumulate option.

What about #cumulate? Does it sound natural?

Updated by parker (Parker Finch) over 3 years ago

y.annikov (Yakov Annikov) wrote in #note-35:

What about #cumulate? Does it sound natural?

I'm not as familiar with the verb "cumulate", so it doesn't sound as natural to me as "accumulate". But that being said, I don't think there's a huge difference between the two words, so either one could work!

Updated by parker (Parker Finch) almost 3 years ago

I like #accumulate, and thank you @nobu (Nobuyoshi Nakada) for an implementation here!

Is there anything I can do to help us come to a decision on this one?

Actions #38

Updated by parker (Parker Finch) almost 3 years ago

  • Subject changed from Enumerable#scan_left to Enumerable#accumulate
  • Description updated (diff)

Updated by mame (Yusuke Endoh) almost 3 years ago

This ticket was discussed in the dev meeting, but no conclusion was reached.

BTW, Mathematica's "Accumulate" is different from this proposal. It is not general but specific to Plus.

https://reference.wolfram.com/language/ref/Accumulate.html

Accumulate[list] is effectively equivalent to FoldList[Plus,list].

Thus, FoldList is more suitable name than Accumulate in this case.

https://reference.wolfram.com/language/ref/FoldList.html

The use case explained in #18 is not so convincing that it deserves a built-in feature. To make something built-in, one should show that it is so frequently written, and/or that it is difficult to work around. We are not sure that retaining a historical context is a very frequent code pattern (@mrkn (Kenta Murata) said that cumulative sum is actually often used in mathematics, but other use case than cumulative sum was not clear). And Eregon's workaround in #9 looks very simple (maybe even simpler than scan_left).

Updated by duerst (Martin Dürst) almost 3 years ago

mame (Yusuke Endoh) wrote in #note-39:

This ticket was discussed in the dev meeting, but no conclusion was reached.

BTW, Mathematica's "Accumulate" is different from this proposal. It is not general but specific to Plus.

https://reference.wolfram.com/language/ref/Accumulate.html

Accumulate[list] is effectively equivalent to FoldList[Plus,list].

That's true.

Thus, FoldList is more suitable name than Accumulate in this case.

https://reference.wolfram.com/language/ref/FoldList.html

I don't thinks FoldList would be appropriate in Ruby. I think if we use that name, it would be fold_list. But we are not dealing with lists, we are dealing with Arrays and Enumerables. But we wouldn't use fold_array, and reducing this to fold only wouldn't be appropriate, because fold is a much more general operation (in general, it's closer to Ruby's inject).

The use case explained in #18 is not so convincing that it deserves a built-in feature. To make something built-in, one should show that it is so frequently written, and/or that it is difficult to work around. We are not sure that retaining a historical context is a very frequent code pattern (@mrkn (Kenta Murata) said that cumulative sum is actually often used in mathematics, but other use case than cumulative sum was not clear).

Even if it's only cumulative sum, that's still quite useful. And if the cumulative sum is the main use case, accumulate may not be such a bad name. Even if this were e.g. for products, accumulate would still work in some sense. Another name might be trace, i.e. in the sense of tracing the evolution of a calculation.

As mentioned above, that's very relevant in a lazy context. A good example might be a bank account with a stream of transfers.

And Eregon's workaround in #9 looks very simple (maybe even simpler than scan_left).

I agree that this doesn't feel very Ruby-ish. For me it's not so much the assignment in the block, but the initialization of a variable outside the block.

Updated by RubyBugs (A Nonymous) almost 3 years ago

Thanks everyone continuing to discuss whether to add this method to the Ruby lazy Enumerable!

In case it is helpful, please permit me to clarify that this method (and the functional programming pattern it represents) is of practical, rather than theoretical benefit.

This method underlies the system presented in the following conference talk ETL and Event Sourcing, which daily rebuilds all system state by re-processing enumerations of the history of data extracted from external systems:

The implementation of this method we use is published here:

A blog post introducing the gem and discussing its usage: https://medium.com/building-panorama-education/scan-left-a-lazy-incremental-alternative-to-inject-f6e946f74c00

Sincere apologies if this additional context is redundant or unnecessary. My intent in presenting this context is just to clarify that this is practical Ruby code extracted from a production system, rather than a purely theoretical matter of interest.

Thanks again!

Actions

Also available in: Atom PDF

Like1
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0