Feature #6284

Add composition for procs

Added by Pablo Herrero over 3 years ago. Updated about 1 month ago.

[ruby-core:44303]
Status:Feedback
Priority:Normal
Assignee:Yukihiro Matsumoto

Description

=begin
It would be nice to be able to compose procs like functions in functional programming languages:

to_camel = :capitalize.to_proc
add_header = ->val {"Title: " + val}

format_as_title = add_header << to_camel << :strip

instead of:

format_as_title = lambda {|val| "Title: " + val.strip.capitalize }

It's pretty easy to implement in pure ruby:

class Proc
  def << block
    proc { |*args| self.call( block.to_proc.call(*args) ) }
  end
end
=end
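A minimal sketch of the behaviour described above, assuming right-to-left application and a splat so that the innermost callable may take several arguments. The `compose` helper is hypothetical (it is not part of the proposal) and is used only so the example runs without reopening `Proc`:

```ruby
# Hypothetical helper: returns a proc that applies g first, then f.
# Both arguments are converted with to_proc, so Symbols are accepted too.
def compose(f, g)
  outer = f.to_proc
  inner = g.to_proc
  proc { |*args| outer.call(inner.call(*args)) }
end

to_camel   = :capitalize.to_proc
add_header = ->(val) { "Title: " + val }

# Stands in for: add_header << to_camel << :strip
format_as_title = compose(compose(add_header, to_camel), :strip)

puts format_as_title.call("  hello world  ") # prints "Title: Hello world"
```

The result matches the hand-written lambda from the description, since the input is stripped first, then capitalized, then prefixed.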

0002-proc.c-Implement-Method-for-Method-composition.patch (2.71 KB) Paul Mucur, 06/14/2015 04:52 PM

0001-proc.c-Implement-Proc-for-Proc-composition.patch (3.73 KB) Paul Mucur, 06/14/2015 04:52 PM

0003-proc.c-Support-any-callable-when-composing-Procs.patch (4.01 KB) Paul Mucur, 06/23/2015 02:45 PM

History

#1 Updated by Thomas Sawyer over 3 years ago

=begin
Or

format_as_title = ->(val) { add_header[to_camel[val.strip]] }

=end

#2 Updated by Thomas Sawyer over 3 years ago

Also, I think #+ is better.

#3 Updated by Pablo Herrero over 3 years ago

trans (Thomas Sawyer) wrote:

Also, I think #+ is better.

I saw facets has a similar feature that uses #* instead, maybe because it looks a bit closer to Haskell's composition syntax. Nevertheless, I still like #<< better; it feels like you are "connecting" the blocks together.

#4 Updated by Alexey Muranov over 3 years ago

I would vote for #*. I think #<< is usually changing the left argument (in place).

#6 Updated by Yusuke Endoh over 3 years ago

  • Status changed from Open to Assigned
  • Assignee set to Yukihiro Matsumoto

#7 Updated by Pablo Herrero over 3 years ago

=begin

aprescott (Adam Prescott) wrote:

See also: ((URL:http://web.archive.org/web/20101228224741/http://drmcawesome.com/FunctionCompositionInRuby))

Maybe #| could be a possibility. (Without implementing #> or #<).

But I find the article's proposition about the chaining order a bit misleading:

transform = add1 | sub3 | negate

For me that feels more like "piping" ((|add1|)) to ((|sub3|)) to ((|negate|)), from left to right, not the other way around.

If we choose to take that path I think the following code would be a plausible implementation:

class Proc
  def | block
    proc { |*args| block.to_proc.call( self.call(*args) ) }
  end
end

class Symbol
  def | block
    self.to_proc | block
  end
end

=end
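To make the left-to-right reading concrete, here is a small sketch of the piping order. The `pipe` helper and the three lambdas are assumptions made for illustration; they mirror the `Proc#|` idea above without reopening core classes:

```ruby
# Hypothetical helper: apply f first, then feed its result to g.
def pipe(f, g)
  first  = f.to_proc
  second = g.to_proc
  proc { |*args| second.call(first.call(*args)) }
end

add1   = ->(n) { n + 1 }
sub3   = ->(n) { n - 3 }
negate = ->(n) { -n }

# Reads like: add1 | sub3 | negate
transform = pipe(pipe(add1, sub3), negate)

puts transform.call(10) # prints -8, i.e. negate(sub3(add1(10)))
```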

#8 Updated by Alexey Muranov over 3 years ago

What about #* for composing traditionally (right to left) and #| for piping (left to right)? In mathematics, depending on the area and the subject, both ways are used, and some argue that "piping" is more natural than "precomposing". However, when functions are "piped", the arguments are usually on the left: (arguments)(function1 function2).

Update: I think having both was a bad idea; it would be redundant.

#9 Updated by Thomas Sawyer over 3 years ago

I agree, #* is appropriate for composition.

#10 Updated by Pablo Herrero over 3 years ago

alexeymuranov (Alexey Muranov) wrote:

Update: I think having both was a bad idea; it would be redundant.

I was going to say the same thing. Having both #* and #| is redundant and also a bit confusing, since #| doesn't really feel like the opposite operation of #* in any context. We should choose one or the other but not both.
I still like #| (chaining from left to right) a bit better, but I would rather have #* than nothing.

#11 Updated by Yukihiro Matsumoto almost 3 years ago

  • Status changed from Assigned to Feedback

Positive about adding function composition. But we need method name consensus before adding it?
Is #* OK for everyone?

Matz.

#12 Updated by Joshua Ballanco over 2 years ago

Might I humbly suggest #<- :

to_camel = :capitalize.to_proc
add_header = ->val {"Title: " + val}

format_as_title = add_header <- to_camel <- :strip

Seems to have a nice symmetry with #->

#13 Updated by Rodrigo Rosenfeld Rosas over 2 years ago

I think "<-" reads better but I'm ok with '*' as well.

#14 Updated by Rohit Arondekar over 2 years ago

I'm with Joshua, I think #<- reads a lot better.

#15 Updated by Alexey Muranov over 2 years ago

=begin
I think that the meaning of (({#<-})) would not be symmetric with the meaning of (({#->})).

Also, in mathematics, arrows are more like relations than operations. When used to describe functions, usually function arguments go to the arrow's tail, function values to the arrow's head, and function's name, for example, goes on top of the arrow.
(In this sense Ruby's lambda syntax would look to me more natural in the form (({f = (a,b)->{ a + b }})) instead of (({f = ->(a,b){ a + b }})).)

The main drawback of #* in my opinion is that it does not specify the direction of composition ((f*g)(x) is f(g(x)) or g(f(x))?), but since in Ruby function arguments are written on the right ((({f(g(x))}))), I think it can be assumed that the inner function is on the right and the outer is on the left.

((Update)) : Just for reference, here is how it is done in Haskell : http://www.haskell.org/haskellwiki/Function_composition
=end

#16 Updated by Marc-Andre Lafortune over 2 years ago

+1 for #*

The symbol used in mathematics for function composition is a circle (∘); the arrows are for the definitions of functions (like lambdas) only, so #<- or whatever make no sense to me.

Finally, the f ∘ g(x) is defined as f(g(x)), so there is no argument there either.

#17 Updated by Martin Dürst over 2 years ago

marcandre (Marc-Andre Lafortune) wrote:

+1 for #*

The symbol used in mathematics for function composition is a circle (∘); the arrows are for the definitions of functions (like lambdas) only, so #<- or whatever make no sense to me.

Very good point.

Finally, the f ∘ g(x) is defined as f(g(x)), so there is no argument there either.

Not true. Depending on which field of mathematics you look at, (f ∘ g)(x) is either f(g(x)) or g(f(x)). The latter is in particular true in work involving relations, see e.g. http://en.wikipedia.org/wiki/Composition_of_relations#Definition.

Speaking from a more programming-related viewpoint, f(g(x)) is what is used e.g. in Haskell, and probably in many other functional languages, and so may be familiar to many programmers.

However, we should take into account that a functional language writes e.g. reverse(sort(array)), so it makes sense to define revsort = reverse * sort (i.e. (f ∘ g)(x) is f(g(x))). But in Ruby, it would be array.sort.reverse, so revsort = sort * reverse may feel much more natural (i.e. (f ∘ g)(x) is g(f(x))).

#18 Updated by Matthew Kerwin over 2 years ago

I agree that (f ∘ g)(x) is g(f(x)) is more intuitive from a purely
programmatic point of view. It is "natural" for the operations to be
applied left to right, exactly like method chaining.

#19 Updated by Alexey Muranov over 2 years ago

phluid61 (Matthew Kerwin) wrote:

I agree that (f ∘ g)(x) is g(f(x)) is more intuitive from a purely
programmatic point of view. It is "natural" for the operations to be
applied left to right, exactly like method chaining.

When functions are applied from left to right, the argument is usually (if not always) on the left. The form (x)(fg)=((x)f)g may look awkward (though i personally used it in a math paper), so i think usually the "exponential" notation is preferred: xfg = (xf)g, where xf corresponds to f(x) in the usual notation.

With method chaining, IMO, the "main argument" of a method is the receiver, and it is on the left. Lambdas and Procs are not chained in the same way as method calls.

Update: I agree that the common syntax for calling functions (f(x) rather than (x)f) should not be an obstacle if Ruby decides to consistently multiply functions putting the inner on the left and the outer on the right. Another syntax for calling functions can be invented in the future, or Rubyists can learn to live with this inconsistency. For example, Ruby (or Matz) can decide to multiply lambdas with the inner on the left and the outer on the right, and add the following syntax:

format_as_title = :strip.to_proc * :capitalize.to_proc * lambda { |val| "Title: " + val }
title = " over here " ^ format_as_title # instead of title = format_as_title.call(" over here ")

Update 2012-11-11: I was not clear what i meant by "multiplying from left to right". I meant to say: putting the inner function on the left and the outer on the right. I am correcting this phrase.

#20 Updated by Rodrigo Rosenfeld Rosas over 2 years ago

In Math multiplication is always associative, even for matrices. I.e.: (A*B)*C == A*(B*C). If we use * for ∘ (composition) it resembles multiplication. Function composition is analogous to matrix multiplication, which is commonly used for transformation compositions as well. In fact, function composition is also associative.
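The associativity claim can be spot-checked with a short sketch. The `compose` helper and the sample lambdas are assumptions made for illustration, and checking a handful of inputs is of course not a proof:

```ruby
# Hypothetical helper: compose(f, g) applies g first, then f.
def compose(f, g)
  proc { |*args| f.call(g.call(*args)) }
end

f = ->(n) { n + 1 }
g = ->(n) { n * 2 }
h = ->(n) { n - 3 }

left  = compose(compose(f, g), h) # (f ∘ g) ∘ h
right = compose(f, compose(g, h)) # f ∘ (g ∘ h)

# Both groupings compute f(g(h(x))) for every sample input.
samples = [-2, 0, 5, 10]
puts samples.all? { |x| left.call(x) == right.call(x) } # prints true
```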

So, when representing h = f ∘ g as h = f * g it makes sense to me (although Math preferring a different symbol for multiplication and composition is a good indication that we should consider this as well for Ruby - more on that later on). But Math representation is procedural, not object oriented. If we try to mix both approaches to fit Ruby philosophy this could lead to great confusion.

Ruby can be also used for procedural programming:

sqrt = ->(n){ Math.sqrt n } # Although I agree that (n)->{} would read more natural to me, just like in CoffeeScript
square_sum = ->(a, b) { a*a + b*b }
hypotenuse = sqrt * square_sum
5 == hypotenuse.call 3, 4 # equivalent to: sqrt.call square_sum.call 3, 4

This makes total sense to me using procedural notation. I'm not sure how would someone use this using some OO notation instead...

Now with regards to composition notation, I think a different notation could help those reading some code and trying to understand it. Suppose this method:

def bad_name(bad_argument_name, b)
  bad_argument_name * b # or bad_argument_name << b
end

You can't know beforehand if bad_argument_name is an array, a number or a proc/lambda. If we read this instead:

def bad_name(bad_argument_name, b)
  bad_argument_name <- b
end

we would then have a clear indication that bad_argument_name is probably a proc/lambda. I know the same argument could be used to differentiate << between strings and arrays among other cases. But I think that function composition is conceptually much different from those other operations (concatenation, multiplication) than concatenation (<<) is for strings and arrays. In both cases we are concatenating but concatenation means different things for strings and arrays in non surprising ways.

But then using this arrow notation I would expect that (a <- b) would mean "a before b" (b(a(...))) while (a ∘ b) means "a after b" (a(b(...))).

I find it a bit awful to use "hypotenuse = square_sum <- sqrt", although it is the way OO usually works ([4, 5].square_sum.sqrt - pseudo-code of course). But we would not be using "[4, 5].hypotenuse", but "hypotenuse.call 4, 5", right? So, since we're using procedural notation for procs/lambdas we should be thinking of procedural programming when deciding which operator to use.

I would really prefer to have lambda syntax as "double = <-{|n| n * 2}" and function composition as "hypotenuse = sqrt -> square_sum" (sqrt after square_sum). But since I don't believe the lambda syntax will ever change, let's try to see this from a different perspective.

Instead of reading (a <- b) as "a before b", I'll try to think of it as being "b applied to a" (a(b(...))). This also makes sense to me so I can easily get used to this. It would work the same way as "*" but there would be a clear indication that this refers to function composition rather than some generic multiplication algorithm.

Having said that, I'd like to confirm that I'm ok with either * or <- and I'd really like to have function composition as part of Ruby.

#21 Updated by Yusuke Endoh over 2 years ago

  • Target version changed from 2.0.0 to next minor

#22 Updated by First Last over 2 years ago

proc composition is not commutative, so the operator should:

  1. not imply commutativity
  2. not conceal the order of application

i.e. the operator should be visually asymmetrical with clear directionality

e.g. <<, <<<, <-

a << b << c = a(b(c(x)))

perhaps it also makes sense to have the other direction: c >> b >> a = a(b(c(x)))
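A short sketch of why the order of operands matters. The `after` helper and the two lambdas are hypothetical names chosen for illustration:

```ruby
# Hypothetical helper: after(f, g) means "f after g", i.e. f(g(x)).
def after(f, g)
  proc { |*args| f.call(g.call(*args)) }
end

double    = ->(n) { n * 2 }
increment = ->(n) { n + 1 }

puts after(double, increment).call(3) # prints 8: double(increment(3))
puts after(increment, double).call(3) # prints 7: increment(double(3))
```

Swapping the operands changes the result, so whichever operator is chosen has to make clear which side runs first.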

#23 Updated by Boris Stitnicky over 2 years ago

+1 to #*.
+1 to rosenfeld's first 2 paragraphs ( h = f ∘ g as h = f * g, and matrix multiplication analogy).
-1 to "<-". Rationale: It is too easy to invent a guitar with one more string. Furthermore, when it comes to operators, I consider design by jury a weak approach.

#24 Updated by Rodrigo Rosenfeld Rosas over 2 years ago

I play a 7-string guitar and I can tell you that the extra string greatly improves our possibilities and it is pretty common in Samba and Choro Brazilian music styles:

http://www.youtube.com/watch?v=3mTdpRY6yMI
http://www.youtube.com/watch?v=_FNDXcVr1Pk (here we not only have a 7-string guitar but also a 10-string bandolim while the usual one has 8 strings)

I'm not against #*. I just slightly prefer "<-" over "*".

#25 Updated by Alexey Muranov over 2 years ago

rits (First Last) wrote:

proc composition is not commutative, so the operator should:

  1. not imply commutativity

In algebra, multiplication is rarely commutative, see for example http://en.wikipedia.org/wiki/Quaternion or http://en.wikipedia.org/wiki/Group_(mathematics)

#26 Updated by Paul Mucur about 2 months ago

Attached patches for Proc#* and Method#* for Proc and Method composition including test cases. Also raised as a pull request on GitHub at https://github.com/ruby/ruby/pull/935

One thing that might be worth discussing is the necessity of the type checking: should it be possible to compose any callable object (rather than explicitly Procs and Methods)? The Ruby implementation suggested in the description of this issue suggests calling to_proc on the given argument but we could also demand that the supplied argument responds to call, e.g. in pure Ruby:

class Proc
  def *(g)
    proc { |*args, &blk| call(g.call(*args, &blk)) }
  end
end

vs.

class Proc
  def *(g)
    proc { |*args, &blk| call(g.to_proc.call(*args, &blk)) }
  end
end
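The practical difference between the two variants can be sketched as follows. The helper names `compose_with_call` and `compose_with_to_proc` and the `adder` object are assumptions for illustration, written as plain methods so that `Proc` is not reopened:

```ruby
# Variant 1: only requires the argument to respond to `call`.
def compose_with_call(f, g)
  proc { |*args, &blk| f.call(g.call(*args, &blk)) }
end

# Variant 2: converts the argument with `to_proc` first.
def compose_with_to_proc(f, g)
  proc { |*args, &blk| f.call(g.to_proc.call(*args, &blk)) }
end

# A callable object that defines `call` but not `to_proc` (illustrative only).
adder = Object.new
def adder.call(x, y)
  x + y
end

double = proc { |x| x * 2 }

puts compose_with_call(double, adder).call(1, 2) # prints 6
puts compose_with_to_proc(double, :succ).call(1) # prints 4

# Each variant rejects the other's input with a NoMethodError:
[
  -> { compose_with_call(double, :succ).call(1) },       # Symbol has no `call`
  -> { compose_with_to_proc(double, adder).call(1, 2) }  # adder has no `to_proc`
].each do |attempt|
  begin
    attempt.call
  rescue NoMethodError => e
    puts "NoMethodError: #{e.name}"
  end
end
```

So the `call` variant accepts arbitrary callable objects but not bare Symbols, while the `to_proc` variant accepts Symbols but not objects that only define `call`.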

#27 Updated by Paul Mucur about 1 month ago

Attached patch to support composing with any object that responds to call (rather than raising a TypeError if the object was not a Proc or Method), e.g.

class Foo
  def call(x, y)
    x + y
  end
end

f = proc { |x| x * 2 }
g = f * Foo.new

g.call(1, 2) #=> 6

#28 Updated by Paul Mucur about 1 month ago

Regarding the syntax: I also support * as the operator where f * g = f(g(x)) (as it seems close enough to the mathematical syntax already used by other languages such as Haskell and Idris) but if that is too divisive, we could choose a method name from the mathematical definition (https://en.wikipedia.org/wiki/Function_composition) instead:

The notation g ∘ f is read as "g circle f ", or "g round f ", or "g composed with f ", "g after f ", "g following f ", or "g of f", or "g on f ".

This opens up the following options:

  • Proc#compose: f.compose(g) #=> f(g(x))
  • Proc#after: f.after(g) #=> f(g(x))
  • Proc#following: f.following(g) #=> f(g(x))
  • Proc#of: f.of(g) #=> f(g(x))
  • Proc#on: f.on(g) #=> f(g(x))

#29 Updated by Pablo Herrero about 1 month ago

It would be nice to be able to compose functions in both directions, as in F#, where you can write g << f or g >> f. Sadly, this was rejected before.

I would settle for having Proc#* for "regular" composition and Proc#| for "piping".
Last time there was no consensus about the syntax. Hopefully we can manage to solve this before 2.3 is released.

#30 Updated by Martin Dürst about 1 month ago

I'm teaching Haskell in a graduate class, so I'm quite familiar with function composition and use it a lot, but the original example isn't convincing at all. For me, in Ruby, something like val.strip.capitalize reads much, much better than some artificially forced function composition. If there were a method String#prepend, it would be even more natural: val.strip.capitalize.prepend('Title: ').

If there are better examples that feel more natural in Ruby, please post them here.

#31 Updated by Benoit Daloze about 1 month ago

Martin Dürst wrote:

I'm teaching Haskell in a graduate class, so I'm quite familiar with function composition and use it a lot, but the original example isn't convincing at all. For me, in Ruby, something like val.strip.capitalize reads much, much better than some artificially forced function composition. If there were a method String#prepend, it would be even more natural: val.strip.capitalize.prepend('Title: ').

If there are better examples that feel more natural in Ruby, please post them here.

There is String#prepend, but it mutates the receiver.
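A quick sketch of the mutation in question, using an explicitly unfrozen string so the example also runs under frozen string literals. The variable names are illustrative:

```ruby
val = String.new("  hello world ")   # unfrozen copy, so mutation is allowed

stripped = val.strip.capitalize       # a new String; val itself is untouched
result   = stripped.prepend("Title: ")

puts result                  # prints "Title: Hello world"
puts result.equal?(stripped) # prints true: prepend returned the receiver itself
puts stripped                # prints "Title: Hello world" as well

# A non-mutating alternative builds a new String instead:
base  = "Hello world"
title = "Title: " + base
puts base                    # prints "Hello world"
```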

#32 Updated by Pablo Herrero about 1 month ago

Martin Dürst wrote:

I'm teaching Haskell in a graduate class, so I'm quite familiar with function composition and use it a lot, but the original example isn't convincing at all. For me, in Ruby, something like val.strip.capitalize reads much, much better than some artificially forced function composition. If there were a method String#prepend, it would be even more natural: val.strip.capitalize.prepend('Title: ').

If there are better examples that feel more natural in Ruby, please post them here.

I don't believe you need a pure FP language to benefit from a feature like this. Many ETL projects like transproc (https://github.com/solnic/transproc) would probably find it useful too.

#33 Updated by Paul Mucur about 1 month ago

Pablo Herrero wrote:

I don't believe you need a pure FP language to benefit from a feature like this. Many ETL projects like transproc (https://github.com/solnic/transproc) would probably find it useful too.

Transproc is what actually inspired me to submit a patch here: my hope is that having functional composition in the Ruby language itself will enable easier data pipelining using only Procs, Methods and other objects implementing call. The presence of curry (http://ruby-doc.org/core-2.2.2/Proc.html#method-i-curry) seems like a good precedent for adding such functional primitives to the core.
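As a rough sketch of that kind of pipelining with today's building blocks, the example below threads a value through a Proc, a Method, a curried Proc and a plain callable object. The `pipeline` helper and all names are hypothetical; this is not how transproc or the attached patches work:

```ruby
# Hypothetical helper: threads a value through callables from left to right.
def pipeline(*steps)
  ->(input) { steps.reduce(input) { |acc, step| step.call(acc) } }
end

# A curried lambda: supplying the first argument yields a one-argument proc.
multiply = ->(a, b) { a * b }.curry
double   = multiply.call(2)

# A plain object implementing call (illustrative only).
limiter = Object.new
def limiter.call(n)
  n.clamp(0, 100)
end

parse_and_scale = pipeline(
  :strip.to_proc,   # Proc built from a Symbol
  method(:Integer), # Method object for Kernel#Integer
  double,           # curried Proc
  limiter           # any object responding to call
)

puts parse_and_scale.call(" 21 ") # prints 42
puts parse_and_scale.call("75")   # prints 100
```

A composition operator on Proc and Method would let the same pipeline be written as a chain of operands instead of a list passed to a helper.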

#34 Updated by Tom Stuart about 1 month ago

I support the proposed Proc#*/Method#* syntax and semantics.

The feature being added is function composition; not relation composition, not method chaining. Its target audience is most likely to read f * g as “f after g”, so that’s how it should work. Perhaps some Ruby programmers will not use this feature directly (as with Proc#curry) because they neither program nor think in a functional style, but it should be designed to be useful and familiar to those who do. The proposed implementation achieves that.

The asterisk isn’t ideal, but it’s the best choice available.
