Actions

Feature #18181

open

Introduce Enumerable#min_with_value, max_with_value, and minmax_with_value

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:105354]

Description

I propose `Enumerable#min_with_value`, `max_with_value` and `minmax_with_value`.
These methods work like this:

``````  %w(abcde fg hijk).min_with_value { |e| e.size } # => ['fg', 2]
%w(abcde fg hijk).max_with_value { |e| e.size } # => ['abcde', 5]
%w(abcde fg hijk).minmax_with_value { |e| e.size } # => [['fg', 2], ['abcde', 5]]
``````

Corresponding to `#min(n)`, an integer argument can be passed to `#min_with_value` or `#max_with_value`.

``````  %w(abcde fg hijk).min_with_value(2) { |e| e.size } # => [['fg', 2], ['hijk', 4]]
%w(abcde fg hijk).max_with_value(2) { |e| e.size } # => [['abcde', 5], ['hijk', 4]]
``````

Motivation¶

When I use `Enumerable#min_by`, I sometimes want to get not only the minimum element
but also the value from the given block.
(e.g.: There are many points. Find the nearest point and get distance to it.)

``````  elem = enum.min_by { |e| foo(e) }
value = foo(elem)
``````

This works, but I'd like to avoid writing foo() twice. (Consider a more complex case.)

This can be written without repeated foo() like belows, but it is slightly complicated and needs extra arrays.

``````  value, elem = enum.map { |e| [foo(e), e] }.min_by(&:first)
``````

If the size of enum is enormous, it is hard to use intermediate arrays.

`Enumerable#min_with_value` solves this problem.

I think `min_with_value` is the best name I could think of, but any suggestions for better names would be appreciated.

Benchmark¶

https://bugs.ruby-lang.org/issues/18181#note-2

Example¶

Solving a traveling salesman problem in nearest neighbor algorithm.

``````require 'set'
Point = Struct.new(:x, :y)
points = Set.new([Point.new(1, 1), Point.new(2, 4), Point.new(3, 3), Point.new(2, 2), Point.new(0, 1)])

total = 0
current = points.first
points.delete(current)
path = [current]

until points.empty?
current, distance = points.min_with_value do |point|
Math.hypot(current.x - point.x, current.y - point.y)
end
total += distance
points.delete(current)
path << current
end

p path
p total
``````

This is a frequently seen use case. I use code like this:

``````%w(abcde fg hi jkl mn).group_by(&:size).min # => [2, ["fg", "hi", "mn"]]
``````

and I don't see a strong need to shorten it into a single method although I am not strongly against doing so. However, I have some concerns with your proposal:

1. Its use will be limited if it only picks a single element when the min/max element is not unique. I propose that it should rather hold an array of the corresponding elements.

2. I don't think the word "value" is appropriate. By this word, what comes to mind most naturally is the min/max value rather than the enumerated element. Perhaps a word like "element(s)" should be used.

3. I don't think the order of the elements you proposed is appropriate. With `each_with_object`, the enumerated entity is followed by the accumulating object, i.e., the word order matches the order of the elements within the subarray. I think you should follow this practice.

4. The methods `min`/`max` are used without a block, `min_by` / `max_by` are used with a block. And in this case, you are using a block.

Thus, I think your proposal should be modified into something like the following (which looks so similar to what can be done already as I have shown above):

``````%w(abcde fg hi jkl mn).min_by_with_elements(&:size) # => [2, ["fg", "hi", "mn"]]
``````

Updated by kyanagi (Kouhei Yanagita)about 1 year ago

`enum.group_by { ... }` holds all of the elements of enum,
so it consumes a lot of memory and is very slow if enum is enormous.

``````require 'benchmark'
Benchmark.bm(16) do |x|
enum = -1000000..1000000
x.report("group_by") { enum.group_by { |i| i.abs }.min }
x.report("map.min_by") { enum.map { |i| [i.abs, i] }.min_by(&:first) }
x.report("min_with_value") { enum.min_with_value { |i| i.abs } }
x.report("each") { enum.each { |i| i.abs } }
end
``````
``````                       user     system      total        real
group_by           1.238664   0.243779   1.482443 (  1.483547)
map.min_by         0.721992   0.047738   0.769730 (  0.770767)
min_with_value     0.236433   0.004105   0.240538 (  0.240596)
each               0.150815   0.000000   0.150815 (  0.151704)
``````

`#min` and `#min_by` return single element even through there are multiple smallest elements.
There may be cases where it is useful to return multiple elements,
but single element will be enough for most cases.
Writing `elements.first` everytime is a little bit of a pain.
Besides, size of the elements may become very large. (if almost all of the elements are mapped into the same value)

I think `#min_with_value` should return the element as primary (i.e. the element is first, the block's value is second)
because `#min` and `#min_by` return the element.
Additionally, this is similar to the way `#each_with_index` and `#each_with_object` pass their block arguments.

``````enum.each_with_index do |elem, i| # <- the element is first, the other is second.
end

enum.each_with_object([]) do |elem, obj| # <- the element is first, the other is second.
end
``````

Naming issue:

The behavior of this method is described as "The minimum element by the block value and its value".
Based on this, the name will be something like `min_element_by_value_and_value` or `min_element_by_value_with_value` if verbosely named.

Considering the method `#min` which returns the element, the word "element" is thought to be able to omitted.
So I suggest the short name `#min_with_value`. (I think `#min_by_with_value` is also possible.)
The word "with" implies that "min(_element)" and "value" are different objects.

Actions #3

Updated by kyanagi (Kouhei Yanagita)about 1 year ago

• Description updated (diff)

Updated by kyanagi (Kouhei Yanagita)about 1 year ago

I added an example code using `#min_with_value`.

Updated by Dan0042 (Daniel DeLorme)about 1 year ago

I like this proposal, but I think @sawa's idea has value also. It would be simple to accommodate both uses with a single syntax:

``````%w(abcde fg xx hijk).min_with_value(&:size) # => [2, 'fg', 'xx']
v,elem = %w(abcde fg xx hijk).min_with_value(&:size)   #in the normal case where you only want the 'first' minimum
v,*elems = %w(abcde fg xx hijk).min_with_value(&:size) #when you want all of them
``````
Actions #6

Updated by kyanagi (Kouhei Yanagita)about 1 year ago

• Description updated (diff)

Updated by kyanagi (Kouhei Yanagita)about 1 year ago

Description about `#min_with_value(n)` added. It corresponds to `#min(n)`.

Updated by Eregon (Benoit Daloze)about 1 year ago

:+1: I've needed this several times.
I think the semantics in the description are the best and clearest.

Updated by matz (Yukihiro Matsumoto)12 months ago

Just for confirmation, is it OK that `min_with_value` returns only a single value?
I don't like the name `_with_value`. It explains that the method returns an additional value with the minimum/maximum value, but no additional information on the value.

Matz.

Updated by Dan0042 (Daniel DeLorme)12 months ago

What about `min_with` ?

``````%w(abcde fg hijk).min_with(&:size) # => ['fg', 2]
#     ^min  ^size
``````

Updated by cvss (Kirill Vechera)12 months ago

There's also a frequent similar problem with `#find` when you need to find the first matched value instead of entry. But since it involves two semantically different code parts, it a bit more complex and cannot be implemented nicely with the same approach:

``````%w(abcde fg hijk).find_with_value { |e| (size = e.size).even? && size } # => ['fg', 2]
``````

We can create some lazy mapping enumerator providing pairs `[entry, value]`, which we unroll in the second chain just as much as we need and without repeating the computation of value:

``````module Enumerable
def with_map &block
map{|e| [e, block.call(e)]}
end
end

%w(abcde fg hijk).lazy.with_map(&:size).find{_2.even?} # => ['fg', 2]
``````

Back to `#min_with_value`, it could be written as:

``````%w(abcde fg hijk).lazy.with_map(&:size).min_by(&:last) # => ['fg', 2]
``````

We can add some shorthand methods specific to this enumerator like these:

``````%w(abcde fg hijk).lazy.with_map(&:size).min_by_value # => ['fg', 2]
%w(abcde fg hijk).lazy.with_map(&:size).min # => ['fg', 2] - #min is overwritten as an alias to min_by_value
%w(abcde fg hijk).lazy.with_map(&:size).find_by_value(&:even?) # => ['fg', 2]
%w(abcde fg hijk).lazy.with_map(&:size).find(&:even?) # => ['fg', 2] - #find is overwritten as an alias to find_by_value
``````

Updated by kyanagi (Kouhei Yanagita)12 months ago

matz (Yukihiro Matsumoto) wrote in #note-9:

Just for confirmation, is it OK that `min_with_value` returns only a single value?

I think that returning a single value will suit most cases.
It matches that `min` also returns a single value.
But if it is considered that returning multiple values is better, I will not disagree with it.

Updated by ko1 (Koichi Sasada)12 months ago

Another idea (evaluate a given block with a result):

``````class Enumerator
def with_value(&b)
v = each(&b)
[v, b.call(v)]
end
end

pp %w(abcde fg hijk).min_by.with_value{|e| e.size}
#=> ["fg", 2]
``````

It is general for `..._by` methods. If given block has side-effect or consumes huge time, it is not acceptable, though.

Updated by shan (Shannon Skipper)11 months ago

For a single value, a "then" suffix might be nice.

``````%w(abcde fg hijk).min_by(&:size)
#=> "fg"

%w(abcde fg hijk).min_by_then(&:size)
#=> 2
``````

It mirrors #min_by and #then with the same block.

``````%w(abcde fg hijk).min_by(&:size).then(&:size)
#=> 2
``````

Updated by baweaver (Brandon Weaver)11 months ago

I like the more generic suggestion from @ko1 (Koichi Sasada) on `with_value`, I had a similar idea before reading it when @shan (Shannon Skipper) pinged me about this ticket.

The other potential is using the `_map` suffix as is the case with `filter_map`:

``````%w(abcde fg hijk).min_map { |e| e.size }
# => ['fg', 2]

%w(abcde fg hijk).max_map { |e| e.size }
# => ['abcde', 5]

%w(abcde fg hijk).minmax_map { |e| e.size }
# => [['fg', 2], ['abcde', 5]]

%w(abcde fg hijk).min_map(2) { |e| e.size }
# => [['fg', 2], ['hijk', 4]]

%w(abcde fg hijk).max_map(2) { |e| e.size }
# => [['abcde', 5], ['hijk', 4]]
``````

This has precedence, though I do wonder how many functions we accumulate in Enumerable that are effectively the composition of two or more Enumerable methods.

More musing, and in a general sense, I'm reminded of Elixir extracting Enumerable-like methods into `Enum` and allowing potential composition and the rejected prior syntax of `Enumerable.:method` which ties into a fairly old post on my hopes for the also rejected pipeline operator that was proposed.

I may write a bit on that general idea, but I do see great value in the general concept of `min_with_value` as I myself have needed it a few times in the past.

Updated by kyanagi (Kouhei Yanagita)10 months ago

Another name proposal:

When I write `array.min_with_value { foo }`, I mean "minimize foo with an element in array".

So `array.minimizing { foo }` (and `maximizing`) may be the candidate.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0