Feature #20770

A *new* pipe operator proposal

Added by AlexandreMagro (Alexandre Magro) 3 months ago. Updated 14 days ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:119335]

Description

Hello,

This is my first contribution here. I have seen previous discussions around introducing a pipe operator, but it seems the community didn't reach a consensus. I would like to revisit this idea with a simpler approach, more of a syntactic sugar that aligns with how other languages implement the pipe operator, but without making significant changes to Ruby's syntax.

Currently, we often write code like this:

value = half(square(add(value, 3)))

We can achieve the same result using the then method:

value = value.then { add(_1, 3) }.then { square(_1) }.then { half(_1) }

While then helps with readability, we can simplify it further using the proposed pipe operator:

value = add(value, 3) |> square(_1) |> half(_1)

Moreover, with the upcoming it feature in Ruby 3.4 (#18980), the code could look even cleaner:

value = add(value, 3) |> square(it) |> half(it)

This proposal uses the anonymous block argument (_1), and with it, it simplifies the code without introducing complex syntax changes. It would allow us to achieve the same results as in other languages that support pipe operators, but in a way that feels natural to Ruby, using existing constructs like then underneath.
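
For reference, the first two forms above already run today. A minimal sketch, assuming hypothetical definitions of add, square and half (they are placeholders in the proposal, not real library methods). The |> form is proposed syntax only and does not parse in current Ruby, so it is left out:

```ruby
# Hypothetical helpers standing in for the proposal's add, square and half.
def add(a, b)
  a + b
end

def square(n)
  n * n
end

def half(n)
  n / 2
end

value = 5

# Nested calls: written outside-in, executed inside-out.
nested = half(square(add(value, 3)))

# Kernel#then with numbered block parameters (Ruby 2.7+): executed in the
# order it is written. Under the proposal, |> would be sugar for this form.
chained = value.then { add(_1, 3) }.then { square(_1) }.then { half(_1) }

puts nested   # prints 32
puts chained  # prints 32
```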

I believe this operator would enhance code readability and maintainability, especially in cases where multiple operations are chained together.

Thank you for considering this proposal!

Updated by nobu (Nobuyoshi Nakada) 2 months ago

  • Tracker changed from Bug to Feature
  • ruby -v deleted (3.3.5)
  • Backport deleted (3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN)

In the previous trial syntax, the receiver of RHS was the result of LHS.

In your proposal, the receiver of RHS is the same as LHS, and the LHS result is passed as an implicit argument?

Updated by AlexandreMagro (Alexandre Magro) 2 months ago

nobu (Nobuyoshi Nakada) wrote in #note-1:

In the previous trial syntax, the receiver of RHS was the result of LHS.

In your proposal, the receiver of RHS is the same as LHS, and the LHS result is passed as an implicit argument?

Exactly, this is the expected behavior of the pipe operator in other functional languages, such as Elixir. In those languages, the left-hand side (LHS) value is passed directly as an argument to the function on the right-hand side (RHS), either as the first or last argument depending on the language. For example, in Elixir, you might write:

value = value |> add(3) |> square() |> half()

My proposal for Ruby offers a more flexible approach. The LHS value can be passed as an explicit argument (using _1 or it), allowing for greater control over how the RHS function handles the received value.

Additionally, this approach simplifies the implementation by treating the RHS as an executable block, just as we already do with .then.

Updated by shuber (Sean Huber) 2 months ago

I would still love to see this type of pipeline functionality implemented with plain expressions instead of new operators.

I have this (old) working proof of concept gem from years ago (basic syntax described below) but it was primarily focused on constant interception. I imagine it can be quite a bit more complex adding support for calling Proc objects and other edge cases.

"https://api.github.com/repos/ruby/ruby".pipe do
  URI.parse
  Net::HTTP.get
  JSON.parse.fetch("stargazers_count")
  yield_self { |n| "Ruby has #{n} stars" }
  Kernel.puts
end
#=> Ruby has 22120 stars

-9.pipe { abs; Math.sqrt; to_i } #=> 3

[9, 64].map(&Math.pipe.sqrt.to_i.to_s) #=> ["3", "8"]

Most of the logic in that proof of concept was related to intercepting method calls to ALL constants, which wouldn't be necessary if it was a core part of the language. The actual "pipeline" functionality (PipeOperator::Pipe and PipeOperator::Closure) is pretty simple - basically just keeping an array of constant+method+args calls and reducing the result when the pipeline ends.

The proof of concept is basically prepending a version of every method in every constant with something like the example below in order to support this "pipeline expressions" syntax:

define_method(method) do |*args, &block|
  if Pipe.open
    Pipe.new(self).__send__(method, *args, &block)
  else
    super(*args, &block)
  end
end

https://github.com/lendinghome/pipe_operator#-pipe_operator
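
The "keep an array of calls and reduce it" idea can be sketched in a few lines. This is only an illustration under assumptions, not the gem's actual code: the step struct and its field names are made up here, and each recorded step is assumed to take the previous result as its first argument.

```ruby
# Hypothetical record of one pipeline step: which object to call,
# which method, and any extra arguments after the piped value.
step = Struct.new(:receiver, :name, :args)

steps = [
  step.new(Integer, :sqrt, []),     # Integer.sqrt(81)  => 9
  step.new(Integer, :sqrt, []),     # Integer.sqrt(9)   => 3
  step.new(Array,   :new,  ["x"]),  # Array.new(3, "x") => ["x", "x", "x"]
]

# When the pipeline "ends", fold the recorded steps over the initial value.
result = steps.reduce(81) do |acc, s|
  s.receiver.public_send(s.name, acc, *s.args)
end

p result
```

The method-interception part of the proof of concept (prepending wrappers onto every constant) is what this sketch deliberately leaves out.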

Updated by bkuhlmann (Brooke Kuhlmann) 2 months ago

For background, this has been discussed before:

  • 15799: This was implemented and then reverted.
  • 20580: This recently popped up as well.
  • There are probably other issues that I'm forgetting about that have been logged on this subject.

Introducing |> as an operator that works like #then would be interesting and would be similar to how Elixir works, as Alexandre mentioned. This is also how Elm works where you can elegantly use |> or <| as mentioned in the Operators documentation.

I also use something similar to how Sean uses a #pipe method with a block but mostly by refining the Symbol class as documented here in my Refinements gem.

Also, similar to what Sean is describing, I provide the ability to pipe commands together without using |> by using my Pipeable gem which builds upon native function composition to nice effect. Here's a snippet:

pipe data,
     check(/Book.+Price/, :match?),
     :parse,
     map { |item| "#{item[:book]}: #{item[:price]}" }

In both cases (refining Symbol or using Pipeable), the solution works well and implements what is described here, just by different means. All of these solutions are fairly performant, but it would be neat if performance could be improved further by optimizing them natively in Ruby.

Updated by AlexandreMagro (Alexandre Magro) 2 months ago

bkuhlmann (Brooke Kuhlmann) wrote in #note-4:

For background, this has been discussed before:

  • 15799: This was implemented and then reverted.
  • 20580: This recently popped up as well.
  • There are probably other issues that I'm forgetting about that have been logged on this subject.

Introducing |> as an operator that works like #then would be interesting and would be similar to how Elixir works, as Alexandre mentioned. This is also how Elm works where you can elegantly use |> or <| as mentioned in the Operators documentation.

I also use something similar to how Sean uses a #pipe method with a block but mostly by refining the Symbol class as documented here in my Refinements gem.

Also, similar to what Sean is describing, I provide the ability to pipe commands together without using |> by using my Pipeable gem which builds upon native function composition to nice effect. Here's a snippet:

pipe data,
     check(/Book.+Price/, :match?),
     :parse,
     map { |item| "#{item[:book]}: #{item[:price]}" }

In both cases (refining Symbol or using Pipeable), the solution works well and implements what is described here, just by different means. All of these solutions are fairly performant, but it would be neat if performance could be improved further by optimizing them natively in Ruby.

One issue with .pipe is that it mixes two approaches: the object method chain (lhs.rhs) and passing the result as an argument (rhs(lhs)). This inconsistency can be a bit confusing because it shifts between the two styles, making it harder to follow the flow.

In the .pipe version:

"https://api.github.com/repos/ruby/ruby".pipe do
  URI.parse
  Net::HTTP.get
  JSON.parse.fetch("stargazers_count")
  yield_self { |n| "Ruby has #{n} stars" }
  Kernel.puts
end

With a pipe operator, we can achieve the same result in a more consistent and readable way:

"https://api.github.com/repos/ruby/ruby"
  |> URI.parse(it)
  |> Net::HTTP.get(it)
  |> JSON.parse(it).fetch("stargazers_count")
  |> puts "Ruby has #{_1} stars"

This keeps the flow of passing the result from one step to the next clear and consistent, making the code easier to read and maintain. The pipe operator doesn’t add any extra complexity to method calls and provides more flexibility regarding how the "piped" value is used, making it feel more natural in the Ruby syntax.

Updated by vo.x (Vit Ondruch) 2 months ago · Edited

Code like add(value, 3) is hardly idiomatic Ruby. If it were idiomatic, you'd likely use value.add(3) or value + 3. Other examples of readable code are here. I can't see what is readable about the new operator.

Also, I'd say that the Math module is a bad example in general, because it seems to be influenced by commonly used math notation. But arguably, having something like Math::PI.cos or 3.14.cos would be quite natural for Ruby.

Updated by AlexandreMagro (Alexandre Magro) 2 months ago

vo.x (Vit Ondruch) wrote in #note-6:

Code like add(value, 3) is hardly idiomatic Ruby. If it were idiomatic, you'd likely use value.add(3) or value + 3. Other examples of readable code are here. I can't see what is readable about the new operator.

Also, I'd say that the Math module is a bad example in general, because it seems to be influenced by commonly used math notation. But arguably, having something like Math::PI.cos or 3.14.cos would be quite natural for Ruby.

I believe there’s a misunderstanding here. The example add(value, 3) is not intended to represent an idiomatic Ruby expression, like value + 3. Rather, it illustrates how a method call that modifies or processes a value would work within a pipeline.

Using the pipe operator is helpful for showing the order of executions. For example, if you want to execute a function f followed by g, you could write:

g(f(x))

However, it's easier to follow the order of executions (e.g., f and then g) when written like this:

x |> f |> g
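
The x |> f |> g form is proposed syntax, but the same left-to-right ordering can be tried today with Kernel#then. A small sketch, assuming f and g are arbitrary one-argument lambdas invented for illustration:

```ruby
f = ->(x) { x + 1 }   # hypothetical first step
g = ->(x) { x * 10 }  # hypothetical second step
x = 4

# g(f(x)): g is written first but f runs first.
inside_out = g.(f.(x))

# then-chain: steps run in the order they are written.
left_to_right = x.then(&f).then(&g)

puts inside_out     # prints 50
puts left_to_right  # prints 50
```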

In real-world scenarios, especially when working with APIs or complex transformations, it's common to prepare data step by step before reaching the final function. Instead of using intermediate variables, which might only be used once, the pipe operator offers a clearer and more efficient solution. For instance, consider fetching and processing data from a client API:

response = URI.parse(client_api_url)
response = Net::HTTP.get(response)
response = JSON.parse(response).fetch("client_data")
puts "Client info: #{response}"

With the pipe operator, the same logic can be simplified and made more readable:

client_api_url
  |> URI.parse(it)
  |> Net::HTTP.get(it)
  |> JSON.parse(it).fetch(important_key)

This approach not only avoids unnecessary variables but also makes the flow of data through the pipeline much clearer. The pipe operator simplifies this pattern and ensures readability, without adding complexity to method calls. It also provides flexibility in how the "passed" value is used throughout the steps.
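
To compare the two shapes without touching the network, here is an offline sketch. It assumes a hypothetical fake_get lambda in place of Net::HTTP.get and a placeholder URL. The chain uses .then, which is what |> would stand for under this proposal:

```ruby
require "json"
require "uri"

# Hypothetical stand-in for Net::HTTP.get so the sketch runs offline:
# it returns a canned JSON body that echoes the requested host.
fake_get = ->(uri) { JSON.generate("client_data" => { "host" => uri.host }) }

client_api_url = "https://api.example.com/clients/1" # placeholder URL

# Step-by-step version with a reused local variable.
response = URI.parse(client_api_url)
response = fake_get.(response)
response = JSON.parse(response).fetch("client_data")

# The same steps as a Kernel#then chain.
piped = client_api_url
  .then { URI.parse(_1) }
  .then { fake_get.(_1) }
  .then { JSON.parse(_1).fetch("client_data") }

p piped
```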

Again, these are simplified examples of real-world problems, where the pipe operator can help streamline and clarify otherwise convoluted method chains.

Updated by ufuk (Ufuk Kayserilioglu) 2 months ago

AlexandreMagro (Alexandre Magro) wrote in #note-7:

With the pipe operator, the same logic can be simplified and made more readable:

client_api_url
  |> URI.parse(it)
  |> Net::HTTP.get(it)
  |> JSON.parse(it).fetch(important_key)

I would like to note that this almost works already today:

irb> client_api_url = "https://jsonplaceholder.typicode.com/posts/1"
#=> "https://jsonplaceholder.typicode.com/posts/1"

irb> pipeline = URI.method(:parse) >> Net::HTTP.method(:get) >> JSON.method(:parse)
#=> #<Proc:0x000000012c62b4e8 (lambda)>

irb> pipeline.call(client_api_url)
#=>
{"userId"=>1,
 "id"=>1,
 "title"=>"sunt aut facere repellat provident occaecati excepturi optio reprehenderit",
 "body"=>
  "quia et suscipit\nsuscipit recusandae consequuntur expedita et cum\nreprehenderit molestiae ut ut quas totam\nnostrum rerum est autem sunt rem eveniet architecto"}

irb> pipeline = URI.method(:parse) >> Net::HTTP.method(:get) >> JSON.method(:parse) >> -> { it.fetch("title") }
#=> #<Proc:0x000000012c4c2778 (lambda)>

irb> pipeline.call(client_api_url)
#=> "sunt aut facere repellat provident occaecati excepturi optio reprehenderit"

You can also make the whole pipeline with just using procs:

(-> { URI.parse(it) } >> -> { Net::HTTP.get(it) } >> -> { JSON.parse(it) } >> -> { it.fetch("title") }).call(client_api_url)
#=> "sunt aut facere repellat provident occaecati excepturi optio reprehenderit"

which is much closer to the syntax that you want, except for the lambda wrappers.

I think with Proc#>> and Proc#<< this need for chaining is mostly in place already. The thing that is really missing is the ability to access a method by name without having to do .method(:name) which was proposed in https://bugs.ruby-lang.org/issues/16264. That proposal would make the first example be:

(URI.:parse >> Net::HTTP.:get >> JSON.:parse >> -> { it.fetch("title") }).call(client_api_url)
#=> "sunt aut facere repellat provident occaecati excepturi optio reprehenderit"

which looks much nicer.
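
The composition approach can also be tried offline. A sketch under assumptions: fake_get is a made-up lambda standing in for Net::HTTP.method(:get), and the URL is a placeholder. Method#>> and Proc#>> are the only composition calls used:

```ruby
require "json"
require "uri"

# Hypothetical stand-in for Net::HTTP.method(:get); returns a canned body.
fake_get = ->(uri) { %({"title": "post at #{uri.path}"}) }

# Each >> feeds the result of the left callable into the right one.
pipeline = URI.method(:parse) >> fake_get >> JSON.method(:parse) >> ->(h) { h.fetch("title") }

title = pipeline.call("https://example.com/posts/1")
puts title # prints "post at /posts/1"
```

Unlike a then-chain, the pipeline here is a value of its own: it can be stored, passed around, and called with different inputs.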

Updated by AlexandreMagro (Alexandre Magro) 2 months ago

ufuk (Ufuk Kayserilioglu) wrote in #note-8:

You can also make the whole pipeline with just using procs:

(-> { URI.parse(it) } >> -> { Net::HTTP.get(it) } >> -> { JSON.parse(it) } >> -> { it.fetch("title") }).call(client_api_url)
#=> "sunt aut facere repellat provident occaecati excepturi optio reprehenderit"

Yes, and it's also possible to achieve this with a chain of .then, which results in a similar structure. The idea of the pipe operator is to be syntactic sugar, bringing functionality from functional languages into Ruby without introducing any complexity, while maintaining Ruby's simplicity.

client_api_url
  .then { URI.parse(it) }
  .then { Net::HTTP.get(it) }
  .then { JSON.parse(it).fetch(important_key) }

Updated by jeremyevans0 (Jeremy Evans) 2 months ago

AlexandreMagro (Alexandre Magro) wrote in #note-9:

Yes, and it's also possible to achieve this with a chain of .then, which results in a similar structure. The idea of the pipe operator is to be syntactic sugar, bringing functionality from functional languages into Ruby without introducing any complexity, while maintaining Ruby's simplicity.

client_api_url
  .then { URI.parse(it) }
  .then { Net::HTTP.get(it) }
  .then { JSON.parse(it).fetch(important_key) }

We could expand the syntax to treat .{} as .then{}, similar to how .() is .call(). With that, you could do:

client_api_url
  .{ URI.parse(it) }
  .{ Net::HTTP.get(it) }
  .{ JSON.parse(it).fetch(important_key) }

Which is almost as low a syntactic overhead as you would want.

Note that we are still in a syntax moratorium, so it's probably better to wait until after that is over and we have crowned the one true parser before seriously considering new syntax.

Updated by AlexandreMagro (Alexandre Magro) 2 months ago

jeremyevans0 (Jeremy Evans) wrote in #note-10:

We could expand the syntax to treat .{} as .then{}, similar to how .() is .call(). With that, you could do:

client_api_url
  .{ URI.parse(it) }
  .{ Net::HTTP.get(it) }
  .{ JSON.parse(it).fetch(important_key) }

Which is almost as low a syntactic overhead as you would want.

Note that we are still in a syntax moratorium, so it's probably better to wait until after that is over and we have crowned the one true parser before seriously considering new syntax.

The idea of using .{} is really creative, but it feels somewhat unintuitive. On the other hand, the pipe operator is a well-established concept, which would ease adoption.

Updated by mame (Yusuke Endoh) 2 months ago

When the pipeline operator was proposed previously (#15799), we briefly discussed the idea of a block notation without a closing bracket (see the meeting log).

For example,

add(value, 3).then do |x|> square(x)

is interpreted as:

add(value, 3).then {|x| square(x) }

However, this notation is a bit outlandish, so it was never taken very seriously.

Reconsidering it with the notation proposed in this ticket:

add(value, 3).then |> square(it).then |> half(it)

is handled as:

add(value, 3).then { square(it).then { half(it) } } # Or:
add(value, 3).then { square(it) }.then { half(it) } # depending on the associativity of |>. I am not sure which is better

It might be a good idea that we specialize this notation only for a block that is so simple that we don't need to name the parameters.

But personally, I also feel that:

value = add(value, 3)
value = square(value)
value = half(value)

is good enough.
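
The two possible associativities can be compared with today's syntax. A sketch assuming hypothetical add, square and half lambdas. Explicit block parameters are used because numbered parameters cannot be used in both an outer and an inner block:

```ruby
# Hypothetical helpers, defined as lambdas to keep the sketch self-contained.
add    = ->(a, b) { a + b }
square = ->(n) { n * n }
half   = ->(n) { n / 2 }

# Right-associative reading: later steps are nested inside earlier blocks.
nested = add.(5, 3).then { |x| square.(x).then { |y| half.(y) } }

# Left-associative reading: each step is a separate block on the chain.
flat = add.(5, 3).then { |x| square.(x) }.then { |y| half.(y) }

# One observable difference: in the nested form an inner step can still
# see the values of outer steps.
both = add.(5, 3).then { |x| square.(x).then { |y| [x, y] } }

puts nested # prints 32
puts flat   # prints 32
p both      # prints [8, 64]
```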

Updated by vo.x (Vit Ondruch) 2 months ago

AlexandreMagro (Alexandre Magro) wrote in #note-7:

To me it just demonstrates that the APIs are likely incomplete and don't provide methods for easy conversion. We have a lot of conversion methods such as #to_str, #to_json, ... But there is no implicit transition from, e.g., a String object to a URI. I'd rather see something like client_api_url.to(URI), which could be the equivalent of URI(client_api_url).

I also like the example provided by @ufuk (Ufuk Kayserilioglu)

Updated by vo.x (Vit Ondruch) 2 months ago

Not to mention, the example ignores error handling, which would IMHO be the biggest problem in a real-life example.

Updated by zverok (Victor Shepelev) 2 months ago

We could expand the syntax to treat .{} as .then{}, similar to how .() is .call().

I really like this idea. Yes, it is not “how it is in other languages”, yet it has a deep internal consistency with other language elements and is easy to understand, both for people and for automatic analysis tools, with no ambiguities about what’s allowed as a step of such a “pipeline” and what’s not, what’s the scope of used names, where the expression ends, and so on.

This is awesome, actually.

Updated by zverok (Victor Shepelev) 2 months ago

vo.x (Vit Ondruch) wrote in #note-14:

AlexandreMagro (Alexandre Magro) wrote in #note-7:

To me it just demonstrates that the APIs are likely incomplete and don't provide methods for easy conversion. We have a lot of conversion methods such as #to_str, #to_json, ... But there is no implicit transition from, e.g., a String object to a URI. I'd rather see something like client_api_url.to(URI), which could be the equivalent of URI(client_api_url).

I don’t think it is realistic, generally. I mean, convert every f(g(x)) to “x should have method g, and the result should have method f, so you can write x.g.f always (or in most widespread situations)”.

Many possible cases can be argued about, but 1) the argument would not necessarily demonstrate that API change is reasonable, and 2) even when reasonable, it is not always possible.

Say, if we take the sequence that is mentioned several times already (string → URL → HTTP get → JSON parse), then both concerns apply:

  1. String#to_url (or String#to(URL)) might be reasonable; HTTPResponse#parse_json... maybe too; but URL#http_get?.. Not everybody would agree.
  2. Even if we agree on adding all those methods in principle, what about using a different HTTP library or a different JSON parser, what would the answer be?.. Something like URL#http_get(with: Typhoeus) or URL#typhoeus_get added for every library? Adding local refinements to handle that depending on the library? What if the HTTP library used depends on dynamic parameters?..

So, while I agree that many APIs in Ruby have an intuition of the “object at hand has all the methods you need for the next step”, in large realistic codebases, it is not so (both technically and ideologically), and then { DifferentDomain.handle(it) } is a very widespread way to mitigate that.

Updated by vo.x (Vit Ondruch) 2 months ago

zverok (Victor Shepelev) wrote in #note-17:

I don’t think it is realistic, generally. I mean, convert every f(g(x)) to “x should have method g, and the result should have method f, so you can write x.g.f always (or in most widespread situations)”.

Right, this was far-fetched and admittedly would not work. But that is why I proposed client_api_url.to(URI), because after all, this is IMHO mostly about type conversion. Why would I ever want to call something like URI.parse(it)? Why would I need to know there is a parse method, and why would I need to put it / _1 multiple times everywhere, every time in a different context?

Updated by AlexandreMagro (Alexandre Magro) 2 months ago

vo.x (Vit Ondruch) wrote in #note-18:

Right, this was far-fetched and admittedly would not work. But that is why I proposed client_api_url.to(URI), because after all, this is IMHO mostly about type conversion. Why would I ever want to call something like URI.parse(it)? Why would I need to know there is a parse method, and why would I need to put it / _1 multiple times everywhere, every time in a different context?

Zverok was precise in his comment.

I understand your point, but the idea of to(URI) introduces an inversion of responsibility, which can lead to dependency inversion issues — a poor practice in software design, especially when working with different libraries.

It's unclear what you mean by client_api_url in this context since, in my example, it was simply a string. Having a .to method on a string seems generic and nonsensical.

As for the question "Why would I ever want to call something like URI.parse(it)?", code is already written this way. The pipe operator doesn’t change the syntax but rather inverts the reading flow.

Lastly, the pipe operator is a well-established concept that aims to streamline existing Ruby syntax, not alter it.

client_api_url
  |> URI.parse(it)
  |> Net::HTTP.get(it)
  |> JSON.parse(it).fetch(important_key)

This is so clean. It's just Ruby.

Updated by Dan0042 (Daniel DeLorme) 2 months ago

I'm not a big fan of this pipe operator idea, but at least the idea of using it is a good one; it solves many problems with previous proposals.

foo = 42
1 |> foo |> BAR
#foo should be localvar but somehow is parsed as method here?
#BAR should be constant but somehow is parsed as method here?
              
1 |> foo(it) |> BAR(it)
#at least foo and BAR are recognizably methods

1 |> foo(it, 2)
2 |> foo(1, it)
hash |> BAR(**it)
#also, it allows flexibility in how the argument is passed

But that being said, this doesn't seem to be so useful to me. If we compare "before" and "after" the pipe operator:

#current
client_api_url
  .then{ URI.parse(it) }
  .then{ Net::HTTP.get(it) }
  .then{ JSON.parse(it).fetch(important_key) }

#with |> syntax sugar
client_api_url
  |> URI.parse(it) 
  |> Net::HTTP.get(it) 
  |> JSON.parse(it).fetch(important_key) 

It really doesn't seem to me that readability is increased in any meaningful way. The benefit seems way too low to justify adding new syntax.

Languages with the pipe operator all have first-class functions (afaik); the two kinda go together. But Ruby doesn't have first-class functions so the usefulness of the pipe operator will inevitably be very limited.

If the pipe operator is introduced I think it should behave similarly to other languages, where the RHS is a callable object. In fact if we define the pipe operator as invoking #call or #bind_call on the RHS, I could see the beginning of a feature that is more useful than just syntax sugar.

str |> JSON.method(:parse)
1 |> Object.instance_method(:to_s) #=> "#<Integer:0x0000000000000003>"

#and now we just need nice shorthands for Mod.method(:name) and Mod.instance_method(:name)  ;-)
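
A rough approximation of "invoke #call or #bind_call on the RHS" can be written as an ordinary helper. This is a sketch under assumptions: the pipe lambda and its dispatch rule are invented here, and the thread does not settle how a real operator would choose between the two.

```ruby
require "json"

# Hypothetical helper: an UnboundMethod gets the piped value as receiver,
# anything else that responds to #call gets it as the argument.
pipe = lambda do |value, callable|
  if callable.is_a?(UnboundMethod)
    callable.bind_call(value)
  else
    callable.call(value)
  end
end

parsed  = pipe.('{"a": 1}', JSON.method(:parse))
upcased = pipe.("abc", String.instance_method(:upcase))

p parsed
puts upcased # prints ABC
```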

Updated by ufuk (Ufuk Kayserilioglu) 2 months ago

I tend to agree with @Dan0042 (Daniel DeLorme) on this one, this seems to go against the nature of Ruby. In Ruby, an expression like URI.parse(it) is always eagerly evaluated, except when it is inside a block. This is not true in other languages; ones that make a distinction between Foo.bar and Foo.bar(), for example. This proposal, however, is adding a new conceptual context in which the evaluation would be delayed, which would be in a sequence of pipeline operators. I am not sure if I like that, to be honest.

In contrast, I like @jeremyevans0 (Jeremy Evans) 's suggestion to add syntactic sugar to .then method in the form of .{} which still keeps the block as the only construct that would delay the evaluation of methods, and it allows the use of numbered block parameters and/or it inside such blocks without any other changes to the language.

Updated by austin (Austin Ziegler) 2 months ago

I think that this is one of the more interesting approaches to a pipeline operator in Ruby as it is just syntax sugar. As I am understanding it:

foo
|> bar(_1, baz)
|> hoge(_1, quux)

would be treated by the parser to be the same as:

foo
  .then { bar(_1, baz) }
  .then { hoge(_1, quux) }

It would be nice (given that there is syntax sugaring happening here) if, when it or _1 is missing, it were implicitly inserted as the first parameter:

foo
|> bar(baz)
|> hoge(quux)

  ==

foo
  .then { bar(_1, baz) }
  .then { hoge(_1, quux) }

This would enable the use of callables (procs and un/bound methods) as suggested by @Dan0042 (Daniel DeLorme) in #note-20.
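
The implicit-first-parameter rule can be emulated with a helper to see how the two spellings line up. A sketch only: bar, hoge and pipe_first are hypothetical lambdas made up for illustration, and nothing here reflects how a parser would actually implement the rule.

```ruby
# Hypothetical stand-ins for bar and hoge from the example above.
bar  = ->(value, suffix) { "#{value}-#{suffix}" }
hoge = ->(value, times)  { value * times }

# Emulates the implicit rule: the piped value is prepended to whatever
# arguments the step was written with.
pipe_first = ->(value, callable, *args) { callable.call(value, *args) }

# Explicit placement, as in the original proposal.
explicit = "foo".then { bar.(_1, "baz") }.then { hoge.(_1, 2) }

# Implicit first argument, as suggested here.
implicit = pipe_first.(pipe_first.("foo", bar, "baz"), hoge, 2)

puts explicit # prints foo-bazfoo-baz
puts implicit # prints foo-bazfoo-baz
```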

I am not sure that without that implicit first parameter, the potential confusion introduced by the differently-shaped blocks is worthwhile. Regardless, as someone who maintains libraries with deep compatibility, I won't be able to use this in those for another decade at least (I still haven't released versions of my most used libraries that are 3.x only), by which time I am hoping to have found someone else to maintain them.

vo.x (Vit Ondruch) wrote in #note-18:

[the pipe operator] is IMHO mostly about type conversion

Having used Elixir heavily for the last seven years, I do not agree with this description. It can be, and the examples in question might be, but it's used equally in transformation (type conversion) and in context passing. Plug (more or less the Elixir equivalent to Rack) is composable because the first parameter to every plug function (whether a function/2 or a module with init/1 and call/2) is a Plug.Conn struct, allowing code like this:

def call(conn, %Config{} = config) do
  {metadata, span_context} =
    start_span(:plug, %{conn: conn, options: Config.telemetry_context(config)})

  conn =
    register_before_send(conn, fn conn ->
      stop_span(span_context, Map.put(metadata, :conn, conn))
      conn
    end)

  results =
    conn
    |> verify_request_headers(config)
    |> Map.new()

  conn
  |> put_private(config.name, results)
  |> dispatch_results(config)
  |> dispatch_on_resolution(config.on_resolution)
end

This is no different than:

def call(conn, %Config{} = config) do
  {metadata, span_context} =
    start_span(:plug, %{conn: conn, options: Config.telemetry_context(config)})

  conn =
    register_before_send(conn, fn conn ->
      stop_span(span_context, Map.put(metadata, :conn, conn))
      conn
    end)

  results = verify_request_headers(conn, config)
  results = Map.new(results)

  conn = put_private(conn, config.name, results)
  conn = dispatch_results(conn, config)
  dispatch_on_resolution(conn, config.on_resolution)
end

I find the former much more readable, because it's more data oriented and indicates that the data flows through the pipe — where it might be transformed (conn |> verify_request_headers(…) |> Map.new()) or it might just be modifying the input parameter (conn |> put_private(…) |> dispatch_results(…) |> dispatch_on_resolution(…)).

jeremyevans0 (Jeremy Evans) wrote in #note-10:

We could expand the syntax to treat .{} as .then{}, similar to how .() is .call(). With that, you could do:

client_api_url
  .{ URI.parse(it) }
  .{ Net::HTTP.get(it) }
  .{ JSON.parse(it).fetch(important_key) }

Which is almost as low a syntactic overhead as you would want.

Note that we are still in a syntax moratorium, so it's probably better to wait until after that is over and we have crowned the one true parser before seriously considering new syntax.

This is … interesting. The biggest problem with it (from my perspective) is that it would privilege {} blocks with this form, because do is a valid method name, so .do URI.parse(it) end would likely be a syntax error. That and the fact that it would be nearly a decade before it could be used by my libraries.

Updated by AlexandreMagro (Alexandre Magro) 2 months ago

ufuk (Ufuk Kayserilioglu) wrote in #note-21:

I tend to agree with @Dan0042 (Daniel DeLorme) on this one, this seems to go against the nature of Ruby. In Ruby, an expression like URI.parse(it) is always eagerly evaluated, except when it is inside a block. This is not true in other languages; ones that make a distinction between Foo.bar and Foo.bar(), for example. This proposal, however, is adding a new conceptual context in which the evaluation would be delayed, which would be in a sequence of pipeline operators. I am not sure if I like that, to be honest.

Actually, with the pipe operator, URI.parse(it) is also inside a block, but the block is implicit.

The block spans from the pipe operator itself to the next pipe operator or a new line, making it simpler and more concise without changing the evaluation flow.

Updated by Eregon (Benoit Daloze) 2 months ago

One concern with so many then {} calls is that they add non-trivial execution overhead (2 method calls + 1 block call for then { foo(it) } vs 1 method call for foo(var)).
So if this is added, I think it should translate to the same code as using local variables, not then {} blocks.

I would write that snippet like this:

json = Net::HTTP.get(URI.parse(client_api_url))
JSON.parse(json).fetch(important_key) 

2 lines of code vs 4, and IMO just as readable if not better.
So in my opinion there is no need for a pipeline operator for this.

Also I would think in real code one would probably want to rescue some exceptions there, and so the pipeline wouldn't gain much visually and might need to be broken down in several parts anyway.
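
To make the error-handling point concrete, here is one way a then-chain might be wrapped today. A sketch under assumptions: fetch_key is a hypothetical helper, and returning nil on failure is just one possible policy.

```ruby
require "json"

# Hypothetical helper: parse a JSON body and fetch one key, mapping the
# two most likely failures to nil instead of raising.
fetch_key = lambda do |body, key|
  body.then { JSON.parse(_1) }.then { _1.fetch(key) }
rescue JSON::ParserError, KeyError => e
  warn "pipeline failed: #{e.class}"
  nil
end

ok          = fetch_key.('{"a": 1}', "a")
bad_json    = fetch_key.("not json", "a")
missing_key = fetch_key.('{"a": 1}', "b")

p ok # prints 1
```

A single rescue around the whole chain cannot tell which step failed except by exception class, which is part of why such pipelines tend to get broken up in practice.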

Updated by zverok (Victor Shepelev) 2 months ago

@Eregon (Benoit Daloze) this example (at least for me) is just an easy target for discussion (because it uses standard libraries, is easily reproducible, and demonstrates the multi-step realistic process that uses several libraries at once).

I believe the point here is not “how it could be rewritten in non-flow-style,” but rather “many people in many codebases find flow-style useful, should we have a syntax sugar for it?”

I can confirm that for me (and many colleagues who were exposed to this style), it seems a more convenient way, especially to structure business code or quick sketching. It also might have a positive effect on overall algorithm structuring: the code author starts to think in “sequence of steps” terms, and (again, especially in complicated business code developed rapidly) it provides some protection against messy methods, where many local variables are calculated, and soon it is hard to tell which of them relate to which of the next steps and how many flows there are.

I think it is also very natural to Ruby, considering that one of the things that sets us apart from many other languages is Enumerable as the central iteration construct, which supports chains of sequence transformations... So, then is just a chain of singular value transformations.

But I think it is not necessary to prefer this style yourself to acknowledge others find it useful. (Well, alternatively, it could be a discussion like “nobody should do that, it shouldn’t be preferred/supported style,” but that’s another discussion.)

Updated by eightbitraptor (Matt V-H) 2 months ago · Edited

The Ruby-lang homepage states that Ruby has

a focus on simplicity and productivity. It has an elegant syntax that is natural to read and easy to write.

And on the about page:

Ruby often uses very limited punctuation and usually prefers English keywords, some punctuation is used to decorate Ruby.

In my opinion this proposal conflicts with this description because:

  1. |> is less natural to read than the English word then. then has a clear and unambiguous meaning, |> is an arbitrary combination of symbols that developers need to learn.
  2. |> masks complexity - requiring users to learn and remember knowledge that could be easily read from the source code.

I don't understand, from reading this discussion, what benefit we would gain from writing the proposed:

client_api_url
  |> URI.parse(it)
  |> Net::HTTP.get(it)
  |> JSON.parse(it).fetch(important_key)

especially when, as has already been pointed out, we can do this in the current version:

client_api_url
  .then { URI.parse(it) }
  .then { Net::HTTP.get(it) }
  .then { JSON.parse(it).fetch(important_key) }

which is arguably more readable, and more intention-revealing (for those of us unfamiliar with Elixir).

Lastly

bringing functionality from functional languages into Ruby without introducing any complexity, while maintaining ruby's simplicity.

This isn't importing functionality from other languages, merely syntax. I'm against adopting syntax if there isn't a clear (and preferably measurable) benefit to the Ruby ecosystem.

Updated by AlexandreMagro (Alexandre Magro) 2 months ago

I strongly agree that new additions should be thoroughly evaluated and aligned with the philosophy of the language ("A programmer's best friend"). I've found the discussion so far to be very productive, and my opinion is that:

I don't see |> as "an arbitrary combination of symbols". I believe the pipe operator is a well-established concept, predating Ruby itself, and symbolic usage to express certain expressions is already present in the language, such as &:method_name instead of { |x| x.method_name }.

Updated by zverok (Victor Shepelev) 2 months ago

A couple of my counterpoints to |> (and towards .{}, if we do need syntax sugar in this place at all):

While |> sure exists in other languages, we need to look into how it plays with the rest of the code/semantics of our language (because in languages where it exists, it is typically supported by many small and large semantical facts).

Say, in Elixir, one might write this (not precise code, writing kind-of pseudocode off the top of my head):

row
  |> String.split('|')
  |> Enumerable.map(fn x -> parse(x) end)
  |> Enumerable.filter(&Number.odd?)
  |> MyModule.process_numbers
  |> String.join('-')

In Ruby, the equivalent would be mostly with “current object’s methods”, as @vo.x (Vit Ondruch) notes, with .then occasionally compensating when you need to use another module:

row
  .split('|')
  .map { parse(it) }
  .filter(&:odd?)
  .then { MyModule.process_numbers(it) }
  .join('-')
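
For anyone who wants to run the chain above, here is a minimal sketch. The parse helper and the body of MyModule.process_numbers are made up for illustration (the thread never defines them), and _1 is used instead of it so that it also runs on Rubies older than 3.4.

```ruby
# Hypothetical stand-ins: neither helper is defined anywhere in the thread.
module MyModule
  def self.process_numbers(numbers)
    numbers.map { _1 * 10 }
  end
end

def parse(text)
  Integer(text)
end

row = "1|2|3|4|5"

result = row
  .split("|")
  .map { parse(_1) }
  .filter(&:odd?)
  .then { MyModule.process_numbers(_1) } # the only step that needs another module
  .join("-")

puts result
```

Every step, including the .then one, is an ordinary method call on the previous result, which is the uniformity being argued for here.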

What would |> bring here?

row
  .split('|')
  .map { parse(it) }
  .filter(&:odd?)
  |> MyModule.process_numbers(it)
  .join('-')

In my view, only syntactical/semantic confusion (what’s the scope in |> line? is join attached to its result, or is it inside the „invisible block”?.. Why do we have a fancy symbol for .then, but not for map or filter, which are arguably even more widespread?..)

Every time the topic arises, I am confused about it the same way. It seems like just chasing “what others have,” without much strong argument other than “but others do it this way.” But I might really miss something here.

Updated by shuber (Sean Huber) 2 months ago · Edited

I agree with @zverok (Victor Shepelev) and am not quite sold on the value of |> over the existing .then{} if we still have to explicitly specify implicit args like it/_1/etc (unlike elixir).

I am intrigued by the .{} syntax though but wish it did more than behave as an alias for .then{}.

What if .{} behaved more like this elixir-style syntax without implicit args?

# existing ruby syntax

url
  .then { URI.parse(it) }
  .then { Net::HTTP.get(it) }
  .then { JSON.parse(it).fetch_values("some", "keys") }
  .then { JSON.pretty_generate(it, allow_nan: false) }
  .then { Example.with_non_default_arg_positioning(other_object, it) }

# proposed ruby syntax

url
  .{ URI.parse }
  .{ Net::HTTP.get }
  .{ JSON.parse.fetch_values("some", "keys") }
  .{ JSON.pretty_generate(allow_nan: false) }
  .{ Example.with_non_default_arg_positioning(other_object, self) }

# one line chaining example
"-9".abs.{Math.sqrt}.to_i.to_s #=> "3"

# maybe support to_proc as well
[9].map(&{Math.sqrt.to_i.to_s}) #=> ["3"]

Updated by AlexandreMagro (Alexandre Magro) 2 months ago · Edited

zverok (Victor Shepelev) wrote in #note-28:

What would |> bring here?

row
  .split('|')
  .map { parse(it) }
  .filter(&:odd?)
  |> MyModule.process_numbers(it)
  .join('-')

In my view, only syntactical/semantic confusion (what’s the scope in |> line? is join attached to its result, or is it inside the „invisible block”?.. Why do we have a fancy symbol for .then, but not for map or filter, which are arguably even more widespread?..)

I’d like to turn the question around and ask what would be returned from the following code?

array_a = [{ name: 'A', points: 30 }, { name: 'B', points: 20 }, { name: 'C', points: 10 }]
array_b = [{ name: 'D', points: 0 }, { name: 'E', points: 0 }]

array_c = array_a
  .sort { |a, b| b[:points] <=> a[:points] }
  + array_b
  .map { |el| el[:name] }

This highlights that mixing operators and methods within a chain can indeed create confusion. The example is tricky because it's not clear if the .map will apply to array_b or to array_a after it has been sorted and concatenated with array_b.

In the same way, the |> operator might introduce confusion if it's mixed in with method chains without proper context. However, just like +, |> is simply another operator. It can be understood like:

  • a |> b translates to something like ->(a) { b }.
  • Similarly, a + b is ->(a, b) { a + b }.

In both your example and mine, the operators (|> and +) could simply be replaced with appropriate methods (then and concat, respectively), depending on the context and desired functionality.
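
A small sketch of that point, assuming the expression is written with the + at the end of a line so that it is unambiguously one expression: the method call binds tighter than the binary operator, so .map applies to array_b alone unless parentheses or a method such as concat are used. The variable names mixed, names and chained are only for illustration.

```ruby
array_a = [{ name: "A", points: 30 }, { name: "B", points: 20 }, { name: "C", points: 10 }]
array_b = [{ name: "D", points: 0 }, { name: "E", points: 0 }]

# `.map` binds to array_b only, so the result mixes hashes and names.
mixed = array_a.sort { |a, b| b[:points] <=> a[:points] } +
  array_b.map { |el| el[:name] }

# Parentheses make the operator version apply `.map` to the whole array.
names = (array_a.sort { |a, b| b[:points] <=> a[:points] } + array_b).map { |el| el[:name] }

# Replacing the operator with a method keeps everything in one chain.
# (`sort` returns a new array, so `concat` does not modify array_a.)
chained = array_a
  .sort { |a, b| b[:points] <=> a[:points] }
  .concat(array_b)
  .map { |el| el[:name] }

p mixed
p names
p chained
```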

Updated by zverok (Victor Shepelev) 2 months ago

@AlexandreMagro (Alexandre Magro) I don’t think this analogy is suitable here.

Of course, there are operators that aren’t convenient to use in chaining (though, I should admit to the sin of sometimes just using the.chain.with.+(argument).like.that, and it works and follows the existing Ruby semantics and intuitions, even if not to everybody’s liking).

But my point was that the proposed construct is specifically for easier chaining but doesn’t fall in line with any other Ruby’s tool for that. I think a comparison with Elixir demonstrates that.

In Elixir, you’ll say, “see, whatever you need to do with the value, just do with more |>, it is all the same.”

In Ruby, you say “when you work with collections, you do .method and blocks; when you work with methods object already has, you do .method; when you need debug print in the middle of the chain, you can .tap { p _1 } just like that... But oh, there is also this one nice operator which you can’t mix with anything but it is there too... And it also creates an invisible block like nowhere else, but it is just there for convenience and looking like Elixir, sometimes!”

That’s the major drawback of the proposal in my eyes, and I fail to see a comparably major gain.

Updated by lpogic (Łukasz Pomietło) 2 months ago · Edited

Has "then" but as a keyword been considered?
In the basic version it could appear as a "begin..then..end" block:

value = begin add value, 3 then square it then half it end

It looks like syntax highlighting is ready. "begin" can be replaced with something else, but then it would be harder to prove such forms:

value = begin value
then add it, 3
then square it
then |v| # optional 'it' name?
  half v
rescue # optional error handling?
  puts "Error"
  0
end

def foo(value)
  add value, 3
then
  square it
then
  half it
end

The endless (and beginless) version may be more controversial, but if used with caution it could make sense:

value = add value, 3 then square it then half it

Going further, why couldn't "then" be a LHS result? It has the potential to be a cure for parenthesis headaches:

(1..5).to_a.join("-").then{ puts it } # => 1-2-3-4-5
# ^ == v
1..5 then.to_a.join "-" then puts it # => 1-2-3-4-5

puts (2 + 2 then * 2 - 2 then ** 2) == (((2 + 2) * 2 - 2) ** 2) # => true

Updated by nevans (Nicholas Evans) 2 months ago

I think there are good reasons to want a |> operator in addition to (or instead of) .{}, but foo.{ bar it } is intriguing syntactic sugar. I think I like it. I just noticed that it was rejected by Matz when #yield_self was introduced. But perhaps (when the syntax moratorium has ended) time will have changed his mind? It does seem to have a natural connection to foo.().

But, I would strongly prefer for it to be an alias for #yield_self; not for #then. Maybe that's a subtle distinction. Many rubyists seem to treat #then as a pure alias for #yield_self. But they are not perfect synonyms. When #then was first proposed, Matz specifically mentioned that they have different semantics:

It is introduced that a normal object can behave like promises.
So the name conflict is intentional.
If you really wanted a non-unwrapping method for promises, use yield_self.

In other words, we should not assume that every object implements #then the exact same way. I have a lot of async code that predates Object#then. From a purely linguistic viewpoint, when we're dealing with an object that represents a completable process, the English word "then" strongly implies that the block will only run after the process has completed.

So I treat #yield_self and #then the same way that I treat equal?, eql?, ==, and #===. The fact that all of these behave more-or-less identically on Object is not determinative: classes should override #eql?, #==, and #=== to properly represent the different forms of equality. Likewise, #then should be overridden for any object that represents a completable process. On the other hand, just like #equal?, #yield_self should never be overridden, and it should only occasionally even be used.

I will use #equal? or #yield_self when the semantics fit, even if that particular object doesn't override #== and #then. E.g:

# runs immediately: so "then" is not appropriate
Thread.new do do_stuff end
  .yield_self { register_task_from_thread it }

# waits for `Thread#value`: so "then" is appropriate
Thread.new do do_stuff end
  .then { handle_result it.value }

async { get_result }           # returns a promise
  .then {|result| use result } # probably _also_ returns a promise
  .value                       # unwrap the promise
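
To make the distinction concrete, here is a rough sketch of a promise-like object. Deferred is a hypothetical class invented for this example (it is not from any library): it overrides #then to wait for the result, while #yield_self keeps the default behaviour and receives the wrapper itself.

```ruby
# Hypothetical promise-like wrapper around a Thread, for illustration only.
class Deferred
  def initialize(&work)
    @thread = Thread.new(&work)
  end

  # Blocks until the work has finished and returns its result.
  def value
    @thread.value
  end

  # Overridden: the block sees the completed result, and the return value
  # is another Deferred (similar to promise chaining).
  def then(&block)
    Deferred.new { block.call(value) }
  end
end

deferred = Deferred.new { 21 }

doubled = deferred.then { |result| result * 2 }.value   # block gets 21
wrapper_class = deferred.yield_self { |obj| obj.class } # block gets the Deferred

puts doubled
puts wrapper_class
```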

I do think there is room for a |> operator that is yet another version of this, with slightly different semantics from both #yield_self and #then. But (concerning this proposal) I share @zverok's concern about creating "an invisible block like nowhere else". We should be very careful about adding unique syntax for a single operator.

Updated by AlexandreMagro (Alexandre Magro) about 2 months ago

Reflecting on the opposing points raised, I believe the pipe operator could work differently, avoiding the issue of "implicit blocks" mentioned by @zverok (Victor Shepelev).

As suggested by @Eregon (Benoit Daloze), translating the operator to local variables reduces the overhead associated with chaining .then.

What I (re)propose is to define the pipe operator as a statement separator, similar to ;, where the LHS expression is evaluated first and its result is stored in the variable _, which we can call the "last expression result", and then the RHS is executed.

For instance, this:

expr_a |> expr_b

Would conceptually translate to:

expr_a => _; expr_b

This way, we could write:

"https://api.github.com/repos/ruby/ruby"
  |> URI.parse(_)
  |> Net::HTTP.get(_)
  |> JSON.parse(_)
  |> _.fetch("stargazers_count")
  |> puts "Ruby has #{_} stars"

This approach maintains clarity, avoids the overhead of multiple .then calls, and introduces the _ variable as the last expression result, similar to the "ANS" button on a calculator.
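
A hand-written approximation of that translation in today's Ruby, using plain assignment to _ between statements. This is only a sketch of the idea, not how an implementation would necessarily work, and the JSON string is a made-up stand-in for the HTTP response so that it runs offline.

```ruby
require "json"

# Made-up response body; the real example would fetch it over HTTP.
_ = '{"stargazers_count": 123}'
_ = JSON.parse(_)                # each step reads the previous result from _
_ = _.fetch("stargazers_count")
message = "Ruby has #{_} stars"

puts message
```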

Updated by AlexandreMagro (Alexandre Magro) about 1 month ago

mame (Yusuke Endoh) wrote in https://bugs.ruby-lang.org/issues/20781#note-9 at DevMeeting:

AlexandreMagro (Alexandre Magro) wrote in #note-8:

  • Improves readability by transforming p(q(r)) into a more natural r |> q |> p, matching how we think.

Do you mean r |> q(_) |> p(_)?

Yes, r |> q |> p was just an abstract notation to explicitly show the order of method calls.

@mame (Yusuke Endoh) I’m replying here because of the "DO NOT discuss then on this ticket, please." mention in the DevMeeting thread.

Updated by lpogic (Łukasz Pomietło) about 1 month ago

What if I need an intermediate result beyond the next step of the method chain?

Cases:

def foo
  r |> q(_) |> p(_)
  return q_result  # "q_result" should be the result of the second step of the chain
end

def foo
  r |> q(_) |> p(_, r_result) # "r_result" should be the result of the first step of the chain
end

Updated by austin (Austin Ziegler) about 1 month ago

lpogic (Łukasz Pomietło) wrote in #note-36:

What if I need an intermediate result beyond the next step of the method chain?

Cases:

def foo
  r |> q(_) |> p(_)
  return q_result  # "q_result" should be the result of the second step of the chain
end

def foo
  r |> q(_) |> p(_, r_result) # "r_result" should be the result of the first step of the chain
end

If you need an intermediate result for any reason, don't use a pipeline, or restructure your return values so that they are returning some sort of context object. In your examples, #tap and #then would be more useful:

def foo
  q(r).tap { p(_1) }
end

def bar
  r
    .then { [_1, q(_1)] }
    .then { |(rr, qr)| p(qr, rr) }
end

def baz
  rr = r
  p(q(rr), rr)
end

I use the pipe operator in Elixir extensively, but I don't think that I’ve seen a proposal that would really improve Ruby's syntax over .then { … } except the proposed .{} acting as syntax sugar for .then {…}.

Updated by AlexandreMagro (Alexandre Magro) about 1 month ago

@lpogic (Łukasz Pomietło) In these cases, you wouldn’t use pipes:

def foo
  q_result = q(r) # because q_result is important and deserves its own variable
  
  p(q_result)

  q_result
end

The pipe operator is useful when you just need to nest method calls and aren’t interested in each individual value. Example:

schema = JSON.parse(File.read(Rails.root.join(RELATIVE_PATH_TO_CONFIG)))

# becomes

schema = Rails.root.join(RELATIVE_PATH_TO_CONFIG) |> File.read _ |> JSON.parse _

# The same written with a .then chain

schema = Rails.root.join(RELATIVE_PATH_TO_CONFIG).then { File.read(_1) }.then { JSON.parse(_1) }

The point is that this shows the natural flow: "Concatenate this string, read the file with that name, and then parse the JSON."

Updated by lpogic (Łukasz Pomietło) about 1 month ago

@austin (Austin Ziegler), @AlexandreMagro (Alexandre Magro) I understand that this is a creature from the functional world. However, I wonder whether the current proposal (statement operator) would allow for such forms:

def foo
  r |> (q_result = q(_)) |> p(_)
  return q_result
end
# or
def foo
  r |> q_result = q(_) |> p(_)
  return q_result
end

def bar
  (r_result = r) |> q(_) |> p(_, r_result)
end

Updated by austin (Austin Ziegler) about 1 month ago

lpogic (Łukasz Pomietło) wrote in #note-39:

@austin (Austin Ziegler), @AlexandreMagro (Alexandre Magro) I understand that this is a creature from the functional world. However, I wonder whether the current proposal (statement operator) would allow for such forms:

def foo
  r |> (q_result = q(_)) |> p(_)
  return q_result
end
# or
def foo
  r |> q_result = q(_) |> p(_)
  return q_result
end

def bar
  (r_result = r) |> q(_) |> p(_, r_result)
end

My opposition to this concept in #note-37 stands for the same reasons. This is unreadable and has indeterminate scope.

A pipeline operator is best used for passing the first parameter (like Elixir), the last parameter (some JavaScript .pipe(…) implementations; Go text templates), or an arbitrary parameter with an explicit marker (it, _, _1, etc.).

Ruby already has a pipeline-like method, #then. If |> or .{} acts as syntactic sugar for #then, I don't see an issue here. If, internally, it’s turned into the effective equivalent of __pipe1 = r; __pipe2 = q(__pipe1); __pipe3 = p(__pipe2); __pipe3, I don't see an issue here. But under no circumstances do I think that the effective temporary assignments should be exposed or made available to method calls further down the pipeline or after the pipeline is complete. That's too much magic.
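
As a rough sketch of the two acceptable readings, the #then chain and the hidden-temporaries expansion give the same result. The lambdas r_step, q_step and p_step are hypothetical stand-ins for r, q and p, and the pipe1..pipe3 names mirror the __pipeN temporaries that a real implementation would keep invisible.

```ruby
# Hypothetical stand-ins for r, q and p.
r_step = -> { 3 }
q_step = ->(x) { x + 1 }
p_step = ->(x) { x * 10 }

# Reading 1: sugar for a #then chain.
via_then = r_step.call.then { q_step.call(_1) }.then { p_step.call(_1) }

# Reading 2: expansion into temporaries (not visible to user code in a real design).
pipe1 = r_step.call
pipe2 = q_step.call(pipe1)
pipe3 = p_step.call(pipe2)

puts via_then
puts pipe3
```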

Updated by baweaver (Brandon Weaver) about 1 month ago

I'd written on the previous iteration of the pipeline operator a while ago here: https://dev.to/baweaver/ruby-2-7-the-pipeline-operator-1b2d

The ending example of what I thought, at the time, was an ideal state of it was:

double = -> v { v * 2 }
increment = -> v { v + 1 }

5
|> double
|> increment
|> to_s(2)
|> reverse
|> to_i

...which mixed both methods and procs to effectively pretend that Ruby was a LISP-1 derivative language, if only for the sake of pipelines. I believe that given the LISP-2 nature of the language this would be confusing, and add complexity for not a lot of practical gain compared to the combination of then and it.

Frequently what folks are looking for is a nicer way to say this:

def some_method(v) = v + 1

5.then(&method(:some_method))

...and there have been a few proposals in that spirit before like .::

HTTP.get(some_url).then(&JSON.:parse)

...which I still think is an interesting potential syntax, and when applied to some of the pipeline proposals may become something like this:

HTTP.get(some_url)
  |> JSON.:parse
  |> filter { |k, v| v.is_a?(Integer) }

But again, comparatively speaking there's not a lot of overhead to then and it in those cases:

HTTP.get(some_url)
  .then { JSON.parse(it) }
  .filter { |k, v| v.is_a?(Integer) }

...except to add more syntax that may be unclear for newer Ruby programmers that will be very hard to find documentation for. Even if I would very much like a shorter way to say Object.method(:something) I debate if it would be wise.
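
For comparison, this is roughly what the method-reference style looks like in current Ruby. It is a sketch only: add_one is a hypothetical helper, and a literal JSON string stands in for the HTTP response.

```ruby
require "json"

# Hypothetical helper, equivalent in shape to some_method above.
def add_one(v) = v + 1

six = 5.then(&method(:add_one))

# Literal JSON in place of HTTP.get(some_url).
parsed = '{"a": 1, "b": "x"}'.then(&JSON.method(:parse))
integers_only = parsed.filter { |_key, value| value.is_a?(Integer) }

puts six
p integers_only
```

The &JSON.method(:parse) form is the part that the .: proposal wanted to shorten; the block form with it or _1 is usually no longer.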

Updated by lpogic (Łukasz Pomietło) about 1 month ago

austin (Austin Ziegler) wrote in #note-40:

Ruby already has a pipeline-like method, #then. If |> or .{} acts as syntactic sugar for #then, I don't see an issue here. If, internally, it’s turned into the effective equivalent of __pipe1 = r; __pipe2 = q(__pipe1); __pipe3 = p(__pipe2); __pipe3, I don't see an issue here. But under no circumstances do I think that the effective temporary assignments should be exposed or made available to method calls further down the pipeline or after the pipeline is complete. That's too much magic.

Can't the "|>" operator also be considered a cousin of "&&", which is unconditional but passes the LHS result?

p1 = (rr = r) |>       q( _)  |> p( _, rr)
p2 = (rr = r) && (qr = q(rr)) && p(qr, rr) 
p1 == p2 # true if r(), q(), p() never return false/nil
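
A quick sketch of where the two diverge, using hypothetical lambdas in place of r, q and p: when a step returns nil, the && chain stops early, while an unconditional chain (written here with .then, since |> does not exist) still runs the last step.

```ruby
# Hypothetical steps; q_step deliberately returns nil.
r_step = -> { 2 }
q_step = ->(_x) { nil }
p_calls = []
p_step = ->(x, y) { p_calls << [x, y]; :done }

# `&&` short-circuits: p_step is never called.
and_result = (rr = r_step.call) && (qr = q_step.call(rr)) && p_step.call(qr, rr)
calls_after_and = p_calls.dup

# Unconditional chain: p_step runs even though q_step returned nil.
then_result = r_step.call
  .then { |r_value| [r_value, q_step.call(r_value)] }
  .then { |(r_value, q_value)| p_step.call(q_value, r_value) }

p and_result
p then_result
p p_calls
```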

Updated by lpogic (Łukasz Pomietło) about 1 month ago

baweaver (Brandon Weaver) wrote in #note-41:

Frequently what folks are looking for is a nicer way to say this:

def some_method(v) = v + 1

5.then(&method(:some_method))

Some proxy object and method missing mechanism may be the way. Example: https://github.com/lpogic/procify

Updated by AlexandreMagro (Alexandre Magro) about 1 month ago

lpogic (Łukasz Pomietło) wrote in #note-43:

baweaver (Brandon Weaver) wrote in #note-41:

Frequently what folks are looking for is a nicer way to say this:

def some_method(v) = v + 1

5.then(&method(:some_method))

Some proxy object and method missing mechanism may be the way. Example: https://github.com/lpogic/procify

There’s no need for the syntax to take this route; using an explicit variable (last expression result variable "_") provides a clearer and more flexible solution. Languages that use pipes, as previously mentioned, have established standards for how parameters flow through the chain (typically as the first or last argument, depending on the language).

An explicit parameter addresses this, making the usage more intuitive and powerful.

Updated by AlexandreMagro (Alexandre Magro) about 1 month ago

austin (Austin Ziegler) wrote in #note-40:

Ruby already has a pipeline-like method, #then. If |> or .{} acts as syntactic sugar for #then, I don't see an issue here. If, internally, it’s turned into the effective equivalent of __pipe1 = r; __pipe2 = q(__pipe1); __pipe3 = p(__pipe2); __pipe3, I don't see an issue here. But under no circumstances do I think that the effective temporary assignments should be exposed or made available to method calls further down the pipeline or after the pipeline is complete. That's too much magic.

I agree with your point. It would indeed be “healthier” not to expose assignments made within the pipeline, as this could be handled as an exception, though I think it might depend on specific implementation details. However, it’s possible to do this with .then:

def bar
  r
    .then { [_1, q(_1)] } # <= keeping q(_1) value as _1[1]   
    .then { |(rr, qr)| p(qr, rr) }
end

Or

x = 5
y = nil

x.then { y = x * 100 } # Here `y` should be previously defined
  
y
# => 500

Updated by lpogic (Łukasz Pomietło) about 1 month ago

I wonder if the pipeline operator with assignment wouldn't also be useful in everyday code:

str = "ABC"
str |>= _.reverse
str # => "CBA"

a_simple_json = '{"key":"value"}'
a_simple_json |>= JSON.parse _
a_simple_json # => {"key"=>"value"}

a = [{greeting: "Hello"}]
a[0][:greeting] |>= "#{_} World!"
a # => [{greeting: "Hello World!"}]
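
For reference, a sketch of how the first and third examples can be spelled today, assuming a |>= b would mean a = a |> b and approximating |> with .then. The target has to be written twice, which is the repetition the operator would remove.

```ruby
str = "ABC"
str = str.then { _1.reverse }

a = [{ greeting: "Hello" }]
a[0][:greeting] = a[0][:greeting].then { "#{_1} World!" }

p str
p a
```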

Updated by AlexandreMagro (Alexandre Magro) about 1 month ago

lpogic (Łukasz Pomietło) wrote in #note-46:

I wonder if the pipeline operator with assignment wouldn't also be useful in everyday code:

str = "ABC"
str |>= _.reverse
str # => "CBA"

a_simple_json = '{"key":"value"}'
a_simple_json |>= JSON.parse _
a_simple_json # => {"key"=>"value"}

a = [{greeting: "Hello"}]
a[0][:greeting] |>= "#{_} World!"
a # => [{greeting: "Hello World!"}]

The pipe operator is well-known, but this type of operation with assignment is something I haven’t seen before, and I’m not sure of its precedence. The examples also don’t illustrate a strong need for it.

# 1

str = "ABC".reverse

# 2

a_simple_json = JSON.parse '{"key":"value"}'

In the third example, a dedicated method would be more suitable, I believe a similar proposal was made here: https://bugs.ruby-lang.org/issues/20818

Updated by zverok (Victor Shepelev) 27 days ago

In case anybody is interested, I spent some time on Saturday experimenting on an implementation of @AlexandreMagro’s idea: https://zverok.space/blog/2024-11-16-elixir-pipes.html

require 'not_a_pipe'

extend NotAPipe

pipe def repos(username)
  username >>
    "https://api.github.com/users/#{_}/repos" >>
    URI.open >>
    _.read >>
    JSON.parse(symbolize_names: true) >>
    _.map { _1.dig(:full_name) }.first(10) >>
    pp
end

It is a “hack”, but hopefully an interesting one (a macro implemented via AST transformation), and maybe it allows playing with different contexts and codebases to see if it fits.

Updated by lpogic (Łukasz Pomietło) 14 days ago

AlexandreMagro (Alexandre Magro) wrote in #note-47:

The pipe operator is well-known, but this type of operation with assignment is something I haven’t seen before, and I’m not sure of its precedence. The examples also don’t illustrate a strong need for it.

I wanted str, a_simple_json and a to be treated as variables, maybe method arguments but impossible to reduce to one line.

The notation a |>= b could be considered syntactic sugar for a = a |> b. Please remember that most binary operators in Ruby have similar syntactic sugar (including || and &&). I don't see why the "pipe" operator should be an exception.

I see the potential for such a feature in situations where we want to perform some operation on an object and replace it with the result. So far, only operations of type foo.a = foo.a / b can be written without explicitly referring to foo.a twice (foo.a /= b). Swapping the order of arguments or using a method instead of an operator breaks the notation. However, using a "pipe" with assignment would give us more freedom, since it would apply to any case where foo.a is on both sides of the assignment: foo.a |>= b / _, foo.a |>= _.div b.

Perhaps this is not an essential issue for the idea itself, but I think it may have an impact on the direction of change.

Another thing I think is worth considering is the conditional "pipe" operator. It could combine features of the "pipe" operator with the &&. Like "pipe" with assignment, it could prevent self-repeating in some cases. Let's assume its notation would be &>:

foo = Struct.new(:bar).new
# Please assume the code above is immutable.

v = foo.bar &> Integer.sqrt _  # No exception here as right side is evaluated only if foo.bar is not false nor nil.
v # => nil

How can I put this more simply?

Updated by austin (Austin Ziegler) 14 days ago

lpogic (Łukasz Pomietło) wrote in #note-49:

The notation a |>= b could be considered syntactic sugar for a = a |> b. Please remember that most binary operators in Ruby have similar syntactic sugar (including || and &&). I don't see why the "pipe" operator should be an exception.

I see the potential for such a feature in situations where we want to perform some operation on an object and replace it with the result. So far, only operations of type foo.a = foo.a / b can be written without explicitly referring to foo.a twice (foo.a /= b). Swapping the order of arguments or using a method instead of an operator breaks the notation. However, using a "pipe" with assignment would give us more freedom, since it would apply to any case where foo.a is on both sides of the assignment: foo.a |>= b / _, foo.a |>= _.div b.

Perhaps this is not an essential issue for the idea itself, but I think it may have an impact on the direction of change.

I do not see how any of this improves the readability of the code. IMO, it does the exact opposite. Remember, the |> proposal here is essentially syntax sugar for .then with an invisible block:

foo.bar |> b / _
foo.bar.then { b / _1 }

Neither of these is as readable as b / foo.bar, and your example introducing |>= is slightly shorter but substantially less readable than foo.bar = b / foo.bar.

A pipeline operator, if one is introduced to Ruby (and I don't think Ruby needs one), is a relatively efficient way of carrying the result from one expression into a parameter of the next expression, and is generally considered less readable when there is only one function in the pipeline. That is:

foo |> bar() # less readable
bar(foo) # more readable

baz(bar(foo)) # less readable
foo
|> bar()
|> baz() # more readable

Another thing I think is worth considering is the conditional "pipe" operator. It could combine features of the "pipe" operator with the &&. Like "pipe" with assignment, it could prevent self-repeating in some cases. Let's assume its notation would be &>:

foo = Struct.new(:bar).new
# Please assume the code above is immutable.

v = foo.bar &> Integer.sqrt _  # No exception here as right side is evaluated only if foo.bar is not false nor nil.
v # => nil

How can I put this more simply?

I don't see how that is more readable than v = foo.bar && Integer.sqrt(foo.bar). I would oppose &> regardless, because |> is not the opposite in the way that && and || are.
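
For what it's worth, both spellings that exist today can be run side by side; this is just a sketch. One difference to keep in mind is that &. skips the call only for nil, whereas && also stops on false.

```ruby
foo = Struct.new(:bar).new # bar is nil at this point

nil_with_and  = foo.bar && Integer.sqrt(foo.bar)
nil_with_safe = foo.bar&.then { Integer.sqrt(_1) }

foo.bar = 16

root_with_and  = foo.bar && Integer.sqrt(foo.bar)
root_with_safe = foo.bar&.then { Integer.sqrt(_1) }

p nil_with_and
p nil_with_safe
p root_with_and
p root_with_safe
```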


As I said, I don't think Ruby needs a pipe operator, but I wonder if a different approach might be taken. In irb, _ is automatically assigned the result of the previous expression (well, expression line; it doesn't carry within an expression). What if that was done for Ruby as a whole? That is, v = foo.bar && Integer.sqrt(_) would be the same as v = foo.bar && Integer.sqrt(foo.bar) because _ would be the value of foo.bar.

Similarly, you could get a pipeline behaviour without any extra syntax elements:

"https://api.github.com/repos/ruby/ruby"
URI.parse(_)
Net::HTTP.get(_)
JSON.parse(_)
_.fetch("stargazers_count")
puts "Ruby has #{_} stars"

It would substantially complicate parsing (one would only want to "assign" _ if an expression uses it), and right now _ is a valid variable name (if usually used for an unused parameter).

Updated by AlexandreMagro (Alexandre Magro) 14 days ago

lpogic (Łukasz Pomietło) wrote in #note-49:

I wanted str, a_simple_json and a to be treated as variables, maybe method arguments but impossible to reduce to one line.

If these variables need to be explicit and cannot be reduced to one line, then you definitely don’t need the pipe operator here. One of the main goals of introducing the pipe operator is to eliminate the need for intermediate variables when they don't add clarity or purpose to the code.

Updated by AlexandreMagro (Alexandre Magro) 14 days ago

austin (Austin Ziegler) wrote in #note-50:

As I said, I don't think Ruby needs a pipe operator, but I wonder if a different approach might be taken. In irb, _ is automatically assigned the result of the previous expression (well, expression line; it doesn't carry within an expression). What if that was done for Ruby as a whole? That is, v = foo.bar && Integer.sqrt(_) would be the same as v = foo.bar && Integer.sqrt(foo.bar) because _ would be the value of foo.bar.

Similarly, you could get a pipeline behaviour without any extra syntax elements:

"https://api.github.com/repos/ruby/ruby"
URI.parse(_)
Net::HTTP.get(_)
JSON.parse(_)
_.fetch("stargazers_count")
puts "Ruby has #{_} stars"

It would substantially complicate parsing (one would only want to "assign" _ if an expression uses it), and right now _ is a valid variable name (if usually used for an unused parameter).

Using the "last expression result" as a global behavior could introduce unnecessary performance overhead, as the interpreter would need to track and update it for every expression.

Furthermore, by explicitly using the pipe operator, the "last expression result" gains a clear meaning on its own, and no longer remains just an anonymous placeholder (_). This avoids the ambiguity that may arise in the code, which isn't an issue when writing expressions line by line in irb.

Updated by austin (Austin Ziegler) 14 days ago

AlexandreMagro (Alexandre Magro) wrote in #note-52:

austin (Austin Ziegler) wrote in #note-50:

It would substantially complicate parsing (one would only want to "assign" _ if an expression uses it), and right now _ is a valid variable name (if usually used for an unused parameter).

Using the "last expression result" as a global behavior could introduce unnecessary performance overhead, as the interpreter would need to track and update it for every expression.

Not necessarily. I don't know much about how the parser works to produce the AST, but if it is able to do a small bit of backtracking, it could detect the use of _ (which would otherwise be an "unused variable") and mark the previous expression as requiring the last expression result. That would mean that the overhead would only exist when used. This sort of backtracking would be required with the use of |> in any case.

Furthermore, by explicitly using the pipe operator, the "last expression result" gains a clear meaning on its own, and no longer remains just an anonymous placeholder (_). This avoids the ambiguity that may arise in the code, which isn't an issue when writing expressions line by line in irb.

I don't entirely agree. The ambiguity still exists because there is (more or less) an implicit block behaviour. If _ already exists in the current scope, both the use of a pipe operator and the implicit "last expression result" would potentially shadow or overwrite the use of _. (It may be a silly idea to use a variable called _, but it is legal to do so right now.)
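
A small sketch of the overwrite concern, under the assumption that the proposed |> would assign to an ordinary local named _. A block parameter named _ is local to its block and leaves the outer variable alone, but a plain assignment (emulating the proposed behaviour by hand) replaces it.

```ruby
_ = "already in use" # legal today, if unusual

pairs = [[1, :a], [2, :b]]
symbols = pairs.map { |_, sym| sym } # block-local _, outer _ is untouched
before = _

# Hand-written emulation of `6 * 7 |> ...` under the statement-separator idea.
_ = 6 * 7
after = _

p symbols
p before
p after
```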

Updated by AlexandreMagro (Alexandre Magro) 14 days ago

austin (Austin Ziegler) wrote in #note-53:

AlexandreMagro (Alexandre Magro) wrote in #note-52:

austin (Austin Ziegler) wrote in #note-50:

It would substantially complicate parsing (one would only want to "assign" _ if an expression uses it), and right now _ is a valid variable name (if usually used for an unused parameter).

Using the "last expression result" as a global behavior could introduce unnecessary performance overhead, as the interpreter would need to track and update it for every expression.

Not necessarily. I don't know much about how the parser works to produce the AST, but if it is able to do a small bit of backtracking, it could detect the use of _ (which would otherwise be an "unused variable") and mark the previous expression as requiring the last expression result. That would mean that the overhead would only exist when used. This sort of backtracking would be required with the use of |> in any case.

Furthermore, by explicitly using the pipe operator, the "last expression result" gains a clear meaning on its own, and no longer remains just an anonymous placeholder (_). This avoids the ambiguity that may arise in the code, which isn't an issue when writing expressions line by line in irb.

I don't entirely agree. The ambiguity still exists because there is (more or less) an implicit block behaviour. If _ already exists in the current scope, both the use of a pipe operator and the implicit "last expression result" would potentially shadow or overwrite the use of _. (It may be a silly idea to use a variable called _, but it is legal to do so right now.)

Using the "last expression result" as a global behavior offers no real advantage over the pipe operator. The pipe operator provides a predictable and explicit structure: "Developer, the next expression depends on the result of this one." This clarity makes the code easier to read and follow.
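As a sketch of that explicit structure in Ruby that runs today, here is the chain from the issue description written with `then`, which the proposal says `|>` would use underneath. The bodies of `add`, `square` and `half` are assumptions, since the description only names them.

```ruby
# Assumed definitions for the helpers named in the issue description.
def add(a, b)
  a + b
end

def square(x)
  x * x
end

def half(x)
  x / 2
end

# Each step visibly consumes the result of the previous one via `_1`.
def pipeline(value)
  value
    .then { add(_1, 3) }
    .then { square(_1) }
    .then { half(_1) }
end

nested = half(square(add(5, 3))) # the nested form from the description
piped  = pipeline(5)             # the chained form, same result
```

Under the proposal, the body of `pipeline` would read `add(value, 3) |> square(_1) |> half(_1)`; the dependency between steps stays tied to the operator rather than to statement order.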

In contrast, treating the "last expression result" as a global behavior introduces the potential for confusing and poorly written code. It could lead to situations where, at first glance, a line of code appears self-contained, only to reveal midway through our mental evaluation that it depends on a prior expression. This undermines readability and makes debugging more challenging, especially in larger or collaborative codebases.

IRB has this behavior, but there we are evaluating and inspecting expressions interactively, one line at a time.
