Project

General

Profile

Actions

Feature #22132

open

Scala-like for comprehensions

Feature #22132: Scala-like for comprehensions

Added by shugo (Shugo Maeda) 3 days ago. Updated about 11 hours ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:125855]

Description

Abstract

How about adding an expression form of for that desugars into nested flat_map/map and filter calls.

Here's an example, which computes all pairs of numbers between 0 and n-1 whose sum is equal to a given value v:

def foo(n, v)
  for i in 0...n,
      j in 0...n when i + j == v then
    [i, j]
  end
end

p foo(10, 10) #=> [[1, 9], [2, 8], [3, 7]...]

The above code is desugared as follows:

def foo(n, v)
  (0...n).flat_map { |i|
    (0...n).filter { |j|
      i + j == v
    }.map { |j|
      [i, j]
    }
  }
end

p foo(10, 10)

Background and Motivation

Some other languages have syntactic sugar that flattens nested code.

For example, Scala has for comprehensions:

def foo(n: Int, v: Int) =
   for i <- 0 until n
       j <- 0 until n if i + j == v
   yield (i, j)

Haskell has do notation:

foo :: Int -> Int -> [(Int, Int)]
foo n v = do
  i <- [0 .. n-1]
  j <- [0 .. n-1]
  guard (i + j == v)
  pure (i, j)

Blocks are often nested deeply in Ruby, so such syntactic sugar is useful.

Use cases

For comprehensions can be used to flatten nested blocks.

For example,

(1..).lazy.flat_map { |z|
  (1..z).lazy.flat_map { |x|
    (x..z).lazy.filter { |y|
      x**2 + y**2 == z**2
    }.map { |y|
      [x, y, z]
    }
  }
}.take(3).force

can be flattened as follows:

for z in (1..).lazy,
    x in (1..z).lazy,
    y in (x..z).lazy when x**2 + y**2 == z**2 then
  [x, y, z]
end.take(3).force

For comprehensions can be used not only for Enumerable objects, but also for other objects with lawful flat_map and map (i.e., any monad whose map is the functor map derived from flat_map). For example, flat_map and map must cohere so that nested calls can be flattened:

m.flat_map(&f).map(&g) == m.flat_map { |x| f.(x).map(&g) }

Why then and when?

Scala's yield conflicts with the existing yield keyword in Ruby, so I chose then.
While a bare for ... then (no guard) reads a little unnaturally in English, Ruby already gives then a value-producing meaning (e.g., Kernel#then), so I consider it acceptable.

The guard keyword is not if but when, because if after the source would be ambiguous with the modifier if.

Limitations

The right operand of in is arg_value, not expr_value, to avoid conflicts, so unparenthesized method calls (command calls) must be parenthesized.

Backward compatibility

  • for x in xs do ... end (and the newline form) is unchanged: it still iterates via each and returns the collection.
  • All of the new forms (for ... then, for ... ,, for ... when) were SyntaxError before, so no existing program changes meaning.

Implementation

PoC: https://github.com/ruby/ruby/pull/17500

It's currently implemented only in parse.y, not in Prism yet, so requires --parser=parse.y.

Open questions

  • Variable scope: the loop variables currently leak, same as for ... do. Should a comprehension instead scope them like a block?
  • Guards desugar to filter, not a lazy withFilter as in Scala. So in the strict case, each guard builds one intermediate array; users who want fusion can add .lazy. Is filter acceptable, or should we add a with_filter-like method?

Related issues 1 (1 open0 closed)

Related to Ruby - Feature #16147: List Comprehensions in RubyOpenActions

Updated by shugo (Shugo Maeda) 3 days ago Actions #1

  • Description updated (diff)

Updated by shugo (Shugo Maeda) 3 days ago Actions #2

  • Description updated (diff)

Updated by shugo (Shugo Maeda) 3 days ago Actions #3

Updated by zverok (Victor Shepelev) 2 days ago 1Actions #4 [ruby-core:125858]

TBH, in all my years with Ruby, it felt for me as its deep characteristic that "you never need for, think in enumerables."

I believe that introducing "for that is useful for some complex cases" will dilute this image. Ruby is almost the only mainstream language that doesn't use for for most of the cases, and it is the first thing to be taught to newcomers: "you'll hardly need for, try to think in Enumerable methods."

So, when met with a riddle like this, I prefer to start considering "what our Enumerables are missing, which would make it more convenient without introducing a new style."

For the first case in the ticket, the answer seems to be #product:

def foo(n, v)
  (0...n).product(0...n).filter { _1 + _2 == v }.map { [_1, _2] }
end

p foo(10, 10)
#=> [[1, 9], [2, 8], [3, 7], [4, 6], [5, 5], [6, 4], [7, 3], [8, 2], [9, 1]]

With the second example (inner iteration depends on outer) product isn't relevant, so I'd probably kept it in the "produce the sequence, then filter it" form:

(1..).lazy.flat_map { |z| (1..z).lazy.flat_map { |x| (x..z).lazy.map { |y| [x, y, z] } } }
  .filter { |x, y, z| x**2 + y**2 == z**2 }.take(3).force

I agree it is a bit cumbersome, but I believe that 3+ nested iterations is a rather rare/edge case; and still, we can have here a Ruby-ish "produce sequence, then filter sequence" structure, which is combinable to everything else (for example, filter condition can come here as a parameter/variable.

PS: As a side note, the product solution wouldn't work in a form I wrote it, because for some reason we don't have Enumerable#product or Enumerator::Lazy#product, and only Enumerator.product. Various forms were proposed several times (#19324, #17312, #14399), but it hasn't moved forward yet.

Updated by shugo (Shugo Maeda) 2 days ago Actions #5

  • Description updated (diff)

Updated by shugo (Shugo Maeda) 2 days ago Actions #6 [ruby-core:125861]

Thanks for the feedback.

I agree that "you rarely need for, think in Enumerable" is core to how Ruby is taught, and I don't want to weaken that.

But I think the "what is Enumerable missing?" framing misses the point here. This proposal is not about bringing back the imperative for loop. It is about removing deep nesting from code that is already written in Enumerable style. The desugared form is plain flat_map/map/filter, the same idiom you recommend, just written flat instead of nested.

The benefit is also not limited to Enumerable. Any type with lawful flat_map/map works. Parser combinators are a classic example: https://www.scala-lang.org/api/2.12.3/scala-parser-combinators/scala/util/parsing/combinator/Parsers$Parser.html

Even a type that is not strictly a monad can be useful. For example, an auto_close wrapper whose flat_map/map yields a resource and closes it in an ensure:

for f1 in auto_close(File.open("a")),
    f2 in auto_close(File.open("b")) then
  f1.read + f2.read
end

This opens several resources without nesting, as a flat alternative to nested File.open blocks. It closes f2 then f1 reliably, even on exceptions.

One of Ruby's strengths is that it does not force a single style, so I think offering a for-comprehension form as one option among many is fine. It does not replace method chains; both can coexist.

This is a question of language design, so I would like to hear Matz's opinion.

Updated by Eregon (Benoit Daloze) 2 days ago ยท Edited Actions #7 [ruby-core:125862]

I agree with @zverok (Victor Shepelev), this looks very much not Ruby-like to me.
I also see no need for it.
It feels more like Python or Scala.
Python has for comprehensions because it doesn't even have a decent map or filter (only as a top-level function and Python lambda are very limited).

My take when seeing for comprehensions in other languages is:

  • Either they are simple and would be as clear/clearer as a simple map/filter/filter_map.
  • Or they are nested and I find them completely unreadable (e.g. with 2+ levels the order is ambiguous). The nesting might not always look pretty but it makes it much clearer and more readable. I believe readability of the code is more important than how pretty it looks.

If there is a lot of nesting, purely functional tends to get very complicated, there is a point where just e.g. appending to an Array declared outside is more readable and easier to follow.

Regarding the first example, I find it illustrative of the lack of need of for` comprehensions that it does something so inefficient, it should simply be:

def foo(n, v)
  (1...[n, v].min).filter_map { |i| [i, v-i] if v-i < n }
end
foo(10, 10)

Updated by ko1 (Koichi Sasada) about 14 hours ago Actions #8 [ruby-core:125865]

just curious: for ... then for single iterator will makes map behavior like that?

for a in ary do
  a
end
# ary.each{it}

for a in ary then
  a
end
# ary.map{it}

Updated by ko1 (Koichi Sasada) about 14 hours ago Actions #9 [ruby-core:125866]

also this proposal contains extension like for iter1, iter2 do ... end?

Updated by shugo (Shugo Maeda) about 11 hours ago Actions #10 [ruby-core:125867]

ko1 (Koichi Sasada) wrote in #note-8:

just curious: for ... then for single iterator will makes map behavior like that?

Yes, a single iterator for ... then desugars to map.
That's why we need then instead of do.

also this proposal contains extension like for iter1, iter2 do ... end?

Currently no, but we could allow for i1 in iter1, i2 in iter2 do ... end and desugar it into nested each (iter1.each { |i1| iter2.each { |i2| ... } }).

Actions

Also available in: PDF Atom