Feature #6284

Add composition for procs

Added by Pablo Herrero about 2 years ago. Updated over 1 year ago.

[ruby-core:44303]
Status: Feedback
Priority: Normal
Assignee: Yukihiro Matsumoto
Category: -
Target version: next minor

Description

=begin
It would be nice to be able to compose procs like functions in functional programming languages:

to_camel = :capitalize.to_proc
add_header = ->val {"Title: " + val}

format_as_title = add_header << to_camel << :strip

instead of:

format_as_title = lambda {|val| "Title: " + val.strip.capitalize }

It's pretty easy to implement in pure ruby:

class Proc
  def << block
    proc { |args| self.call( block.to_proc.call(args) ) }
  end
end
=end
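A minimal runnable sketch of the idea above. It assumes a free-standing helper named `compose` (hypothetical, not a core method) so that no core class is reopened, and it uses a splat so the innermost function may take several arguments, which the `|args|` version in the description does not handle:

```ruby
# Hypothetical helper, not part of Ruby core: composes right to left and
# accepts anything that responds to #to_proc (procs, lambdas, symbols).
def compose(*fns)
  procs = fns.map(&:to_proc)
  lambda do |*args|
    # The rightmost function runs first; each result is fed to the next one leftward.
    first, *rest = procs.reverse
    rest.reduce(first.call(*args)) { |acc, fn| fn.call(acc) }
  end
end

to_camel   = :capitalize.to_proc
add_header = ->(val) { "Title: " + val }

format_as_title = compose(add_header, to_camel, :strip)
format_as_title.call("  over here  ") # => "Title: Over here"
```

The argument order mirrors `add_header << to_camel << :strip`: the last function listed is applied first.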

History

#1 Updated by Thomas Sawyer about 2 years ago

=begin
Or

format_as_title = ->(val) { add_header[to_camel[val.strip]] }

=end

#2 Updated by Thomas Sawyer about 2 years ago

Also, I think #+ is better.

#3 Updated by Pablo Herrero about 2 years ago

trans (Thomas Sawyer) wrote:

Also, I think #+ is better.

I saw that facets has a similar feature that uses #* instead, maybe because it looks a bit closer to Haskell's composition syntax. Nevertheless, I still like #<< better; it feels like you are "connecting" the blocks together.

#4 Updated by Alexey Muranov about 2 years ago

I would vote for #*. I think #<< is usually changing the left argument (in place).

#6 Updated by Yusuke Endoh about 2 years ago

  • Status changed from Open to Assigned
  • Assignee set to Yukihiro Matsumoto

#7 Updated by Pablo Herrero about 2 years ago

=begin

aprescott (Adam Prescott) wrote:

See also: ((URL:http://web.archive.org/web/20101228224741/http://drmcawesome.com/FunctionCompositionInRuby))

Maybe #| could be a possibility. (Without implementing #> or #<).

But I find the article's proposition about the chaining order a bit misleading:

transform = add1 | sub3 | negate

For me that feels more like "piping" ((|add1|)) to ((|sub3|)) to ((|negate|)), from left to right, not the other way around.

If we choose to take that path I think the following code would be a plausible implementation:

class Proc
  def | block
    proc { |*args| block.to_proc.call( self.call(*args) ) }
  end
end

class Symbol
  def | block
    self.to_proc | block
  end
end

=end
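For illustration, here is the `#|` proposal above run end to end. `Proc#|` and `Symbol#|` are the hypothetical methods from this comment, not core methods; the bodies of `add1`, `sub3` and `negate` are assumptions, since the thread does not define them:

```ruby
class Proc
  # Left-to-right piping as proposed in this comment (not a core method):
  # the receiver runs first, then its result is passed to `other`.
  def |(other)
    proc { |*args| other.to_proc.call(call(*args)) }
  end
end

class Symbol
  # Lets a symbol start a pipeline by converting it to a proc first.
  def |(other)
    to_proc | other
  end
end

# Assumed definitions, chosen only to make the order of application visible.
add1   = ->(x) { x + 1 }
sub3   = ->(x) { x - 3 }
negate = ->(x) { -x }

transform = add1 | sub3 | negate
transform.call(10) # negate(sub3(add1(10))) => -8

(:strip | :capitalize).call("  hello ") # => "Hello"
```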

#8 Updated by Alexey Muranov about 2 years ago

What about #* for composing traditionally (right to left) and #| for piping (left to right)? In mathematics, depending on the area and the subject, both ways are used, and some argue that "piping" is more natural than "precomposing". However, when functions are "piped", the arguments are usually on the left: (arguments)(function1 function2).

Update: I think having both was a bad idea; it would be redundant.

#9 Updated by Thomas Sawyer about 2 years ago

I agree, #* is appropriate for composition.

#10 Updated by Pablo Herrero about 2 years ago

alexeymuranov (Alexey Muranov) wrote:

Update: I think having both was a bad idea; it would be redundant.

I was going to say the same thing. Having both #* and #| is redundant and also a bit confusing, since #| doesn't really feel like the opposite operation of #* in any context. We should choose one or the other but not both.
I still like #| (chaining from left to right) a bit better, but I'd rather have #* than nothing.

#11 Updated by Yukihiro Matsumoto over 1 year ago

  • Status changed from Assigned to Feedback

Positive about adding function composition. But we need method name consensus before adding it?
Is #* OK for everyone?

Matz.

#12 Updated by Joshua Ballanco over 1 year ago

Might I humbly suggest #<- :

to_camel = :capitalize.to_proc
add_header = ->val {"Title: " + val}

format_as_title = add_header <- to_camel <- :strip

Seems to have a nice symmetry with #->

#13 Updated by Rodrigo Rosenfeld Rosas over 1 year ago

I think "<-" reads better but I'm ok with '*' as well.

#14 Updated by Rohit Arondekar over 1 year ago

I'm with Joshua, I think #<- reads a lot better.

#15 Updated by Alexey Muranov over 1 year ago

=begin
I think that the meaning of (({#<-})) would not be symmetric with the meaning of (({#->})).

Also, in mathematics, arrows are more like relations than operations. When used to describe functions, usually function arguments go to the arrow's tail, function values to the arrow's head, and function's name, for example, goes on top of the arrow.
(In this sense Ruby's lambda syntax would look to me more natural in the form (({f = (a,b)->{ a + b }})) instead of (({f = ->(a,b){ a + b }})).)

The main drawback of #* in my opinion is that it does not specify the direction of composition (is (f*g)(x) equal to f(g(x)) or to g(f(x))?), but since in Ruby function arguments are written on the right ((({f(g(x))}))), I think it can be assumed that the inner function is on the right and the outer is on the left.

((Update)) : Just for reference, here is how it is done in Haskell : http://www.haskell.org/haskellwiki/Function_composition
=end

#16 Updated by Marc-Andre Lafortune over 1 year ago

+1 for #*

The symbol used in mathematics for function composition is a circle (∘); the arrows are for the definitions of functions (like lambdas) only, so #<- or whatever makes no sense to me.

Finally, the f ∘ g(x) is defined as f(g(x)), so there is no argument there either.

#17 Updated by Martin Dürst over 1 year ago

marcandre (Marc-Andre Lafortune) wrote:

+1 for #*

The symbol used in mathematics for function composition is a circle (∘); the arrows are for the definitions of functions (like lambdas) only, so #<- or whatever makes no sense to me.

Very good point.

Finally, the f ∘ g(x) is defined as f(g(x)), so there is no argument there either.

Not true. Depending on which field of mathematics you look at, (f ∘ g)(x) is either f(g(x)) or g(f(x)). The latter is in particular true in work involving relations, see e.g. http://en.wikipedia.org/wiki/Composition_of_relations#Definition.

Speaking from a more programming-related viewpoint, f(g(x)) is what is used e.g. in Haskell, and probably in many other functional languages, and so may be familiar to many programmers.

However, we should take into account that a functional language writes e.g. reverse(sort(array)), so it makes sense to define revsort = reverse * sort (i.e. (f ∘ g)(x) is f(g(x))). But in Ruby, it would be array.sort.reverse, so revsort = sort * reverse may feel much more natural (i.e. (f ∘ g)(x) is g(f(x))).
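The two conventions can be put side by side in a small sketch. The names `outer_first` and `inner_first` are hypothetical helpers introduced only for this illustration; neither is a proposed API:

```ruby
sort    = :sort.to_proc
reverse = :reverse.to_proc

# Convention 1: (f ∘ g)(x) = f(g(x)); the outer function is written first.
outer_first = ->(f, g) { ->(x) { f.call(g.call(x)) } }
# Convention 2: (f ∘ g)(x) = g(f(x)); functions are written in order of application.
inner_first = ->(f, g) { ->(x) { g.call(f.call(x)) } }

revsort_a = outer_first.call(reverse, sort) # reads like reverse(sort(array))
revsort_b = inner_first.call(sort, reverse) # reads like array.sort.reverse

revsort_a.call([3, 1, 2]) # => [3, 2, 1]
revsort_b.call([3, 1, 2]) # => [3, 2, 1]

# Swapping the operands under the same convention gives a different function:
outer_first.call(sort, reverse).call([3, 1, 2]) # => [1, 2, 3]
```

Both conventions can express the same function; what changes is which operand order the reader has to learn.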

#18 Updated by Matthew Kerwin over 1 year ago

I agree that defining (f ∘ g)(x) as g(f(x)) is more intuitive from a purely
programmatic point of view. It is "natural" for the operations to be
applied left to right, exactly like method chaining.


#19 Updated by Alexey Muranov over 1 year ago

phluid61 (Matthew Kerwin) wrote:

I agree that defining (f ∘ g)(x) as g(f(x)) is more intuitive from a purely
programmatic point of view. It is "natural" for the operations to be
applied left to right, exactly like method chaining.

When functions are applied from left to right, the argument is usually (if not always) on the left. The form (x)(fg)=((x)f)g may look awkward (though i personally used it in a math paper), so i think usually the "exponential" notation is preferred: xfg = (xf)g, where xf corresponds to f(x) in the usual notation.

With method chaining, IMO, the "main argument" of a method is the receiver, and it is on the left. Lambdas and Procs are not chained in the same way as method calls.

Update: I agree that the common syntax for calling functions (f(x) rather than (x)f) should not be an obstacle if Ruby decides to consistently multiply functions putting the inner on the left and the outer on the right. Another syntax for calling functions can be invented in the future, or Rubyists can learn to live with this inconsistency. For example, Ruby (or Matz) can decide to multiply lambdas with the inner on the left and the outer on the right, and add the following syntax:

format_as_title = :strip.to_proc * :capitalize.to_proc * lambda { |val| "Title: " + val }
title = " over here " ^ format_as_title # instead of title = format_as_title.call(" over here ")

Update 2012-11-11: I was not clear what i meant by "multiplying from left to right". I meant to say: putting the inner function on the left and the outer on the right. I am correcting this phrase.

#20 Updated by Rodrigo Rosenfeld Rosas over 1 year ago

In Math multiplication is always associative, even for matrices, i.e. (AB)C == A(BC). If we use * for ∘ (composition) it resembles multiplication. Function composition is analogous to matrix multiplication, which is commonly used for composing transformations as well. In fact, function composition is also associative.

So, when representing h = f ∘ g as h = f * g it makes sense to me (although Math preferring a different symbol for multiplication and composition is a good indication that we should consider this as well for Ruby - more on that later on). But Math representation is procedural, not object oriented. If we try to mix both approaches to fit Ruby philosophy this could lead to great confusion.

Ruby can be also used for procedural programming:

sqrt = ->(n){ Math.sqrt n } # Although I agree that (n)->{} would read more natural to me, just like in CoffeeScript
square_sum = ->(a, b) { a*a + b*b }
hypotenuse = sqrt * square_sum
5 == hypotenuse.call 3, 4 # equivalent to: sqrt.call square_sum.call 3, 4

This makes total sense to me using procedural notation. I'm not sure how someone would write this using some OO notation instead...
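To make the example above runnable, here is a sketch that assumes a hypothetical `Proc#*` defined so that `(f * g).call(*args)` equals `f.call(g.call(*args))`. This is one possible reading of the proposal, not a core method. The helper `double` is also an assumption, added only to check associativity:

```ruby
class Proc
  # Hypothetical Proc#*: right-to-left composition. The right operand
  # receives all arguments; the receiver gets its single result.
  def *(other)
    proc { |*args| call(other.to_proc.call(*args)) }
  end
end

sqrt       = ->(n) { Math.sqrt(n) }
square_sum = ->(a, b) { a * a + b * b }
double     = ->(n) { n * 2 } # assumed helper, not from the thread

hypotenuse = sqrt * square_sum
hypotenuse.call(3, 4) # => 5.0

# Associativity: the grouping does not change the composed function.
left  = (double * sqrt) * square_sum
right = double * (sqrt * square_sum)
left.call(3, 4) == right.call(3, 4) # => true (both give 10.0)
```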

Now with regards to composition notation, I think a different notation could help those reading some code and trying to understand it. Suppose this method:

def bad_name(bad_argument_name, b)
  bad_argument_name * b # or bad_argument_name << b
end

You can't know beforehand if bad_argument_name is an array, a number or a proc/lambda. If we read this instead:

def bad_name(bad_argument_name, b)
  bad_argument_name <- b
end

we would then have a clear indication that bad_argument_name is probably a proc/lambda. I know the same argument could be used to differentiate << between strings and arrays, among other cases. But I think function composition differs conceptually from those other operations (concatenation, multiplication) far more than concatenation (<<) differs between strings and arrays. In both of those cases we are concatenating, and concatenation means different things for strings and arrays in unsurprising ways.

But then using this arrow notation I would expect that (a <- b) would mean "a before b" (b(a(...))) while (a ∘ b) means "a after b" (a(b(...))).

I find it a bit awful to use "hypotenuse = square_sum <- sqrt", although it is the way OO usually works ([4, 5].square_sum.sqrt - pseudo-code of course). But we would not be using "[4, 5].hypotenuse", but "hypotenuse.call 4, 5", right? So, since we're using procedural notation for procs/lambdas we should be thinking of procedural programming when deciding which operator to use.

I would really prefer to have lambda syntax as "double = <-{|n| n * 2}" and function composition as "hypotenuse = sqrt -> square_sum" (sqrt after square_sum). But since I don't believe the lambda syntax will ever change, let's try to see this from a different perspective.

Instead of reading (a <- b) as "a before b", I'll try to think of it as being "b applied to a" (a(b(...))). This also makes sense to me so I can easily get used to this. It would work the same way as "*" but there would be a clear indication that this refers to function composition rather than some generic multiplication algorithm.

Having said that, I'd like to confirm that I'm ok with either * or <- and I'd really like to have function composition as part of Ruby.

#21 Updated by Yusuke Endoh over 1 year ago

  • Target version changed from 2.0.0 to next minor

#22 Updated by First Last over 1 year ago

proc composition is not commutative, so the operator should:

  1. not imply commutativity
  2. not conceal the order of application

i.e. the operator should be visually asymmetrical with clear directionality

e.g. <<, <<<, <-

a << b << c = a(b(c(x)))

perhaps it also makes sense to have the other direction: c >> b >> a = a(b(c(x)))
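Both directions can be tried directly, assuming Ruby 2.6 or later, where `Proc#<<` and `Proc#>>` are available in core. The lambdas `a`, `b` and `c` are placeholders that only record the order in which they were applied:

```ruby
# Assumes Ruby 2.6+ (core Proc#<< and Proc#>>).
# Each lambda wraps its input so the nesting order is visible in the result.
a = ->(x) { "a(#{x})" }
b = ->(x) { "b(#{x})" }
c = ->(x) { "c(#{x})" }

right_to_left = a << b << c # c runs first, a runs last
left_to_right = c >> b >> a # same pipeline, written in order of application

right_to_left.call("x") # => "a(b(c(x)))"
left_to_right.call("x") # => "a(b(c(x)))"
```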

#23 Updated by Boris Stitnicky over 1 year ago

+1 to #*.
+1 to rosenfeld's first 2 paragraphs ( h = f ∘ g as h = f * g, and matrix multiplication analogy).
-1 to "<-". Rationale: It is too easy to invent a guitar with one more string. Furthermore, when it comes to operators, I consider design by jury a weak approach.

#24 Updated by Rodrigo Rosenfeld Rosas over 1 year ago

I play a 7-string guitar and I can tell you that the extra string greatly improves our possibilities and it is pretty common in Samba and Choro Brazilian music styles:

http://www.youtube.com/watch?v=3mTdpRY6yMI
http://www.youtube.com/watch?v=_FNDXcVr1Pk (here we not only have a 7-string guitar but also a 10-string bandolim while the usual one has 8 strings)

I'm not against #*. I just slightly prefer "<-" over "*".

#25 Updated by Alexey Muranov over 1 year ago

rits (First Last) wrote:

proc composition is not commutative, so the operator should:

  1. not imply commutativity

In algebra, multiplication is rarely commutative, see for example http://en.wikipedia.org/wiki/Quaternion or http://en.wikipedia.org/wiki/Group_(mathematics)
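As a concrete instance, 2x2 matrix multiplication is associative but not commutative. The sketch below uses plain nested arrays and a hypothetical helper `mat_mul`; the matrices are arbitrary values chosen for illustration:

```ruby
# Hypothetical helper: multiplies two matrices given as arrays of rows.
def mat_mul(a, b)
  cols = b.transpose
  a.map { |row| cols.map { |col| row.zip(col).sum { |x, y| x * y } } }
end

a = [[1, 1], [0, 1]]
b = [[1, 0], [1, 1]]
c = [[2, 0], [0, 3]]

mat_mul(a, b) # => [[2, 1], [1, 1]]
mat_mul(b, a) # => [[1, 1], [1, 2]]  (differs from a*b: not commutative)

# Associative: the grouping does not matter.
mat_mul(mat_mul(a, b), c) == mat_mul(a, mat_mul(b, c)) # => true
```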
