Project

General

Profile

Actions

Feature #18181

open

Introduce Enumerable#min_with_value, max_with_value, and minmax_with_value

Added by kyanagi (Kouhei Yanagita) about 1 month ago. Updated about 1 month ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:105354]

Description

PR is https://github.com/ruby/ruby/pull/4874

I propose Enumerable#min_with_value, max_with_value and minmax_with_value.
These methods work like this:

  %w(abcde fg hijk).min_with_value { |e| e.size } # => ['fg', 2]
  %w(abcde fg hijk).max_with_value { |e| e.size } # => ['abcde', 5]
  %w(abcde fg hijk).minmax_with_value { |e| e.size } # => [['fg', 2], ['abcde', 5]]

Corresponding to #min(n), an integer argument can be passed to #min_with_value or #max_with_value.

  %w(abcde fg hijk).min_with_value(2) { |e| e.size } # => [['fg', 2], ['hijk', 4]]
  %w(abcde fg hijk).max_with_value(2) { |e| e.size } # => [['abcde', 5], ['hijk', 4]]

Motivation

When I use Enumerable#min_by, I sometimes want to get not only the minimum element
but also the value from the given block.
(e.g.: There are many points. Find the nearest point and get distance to it.)

  elem = enum.min_by { |e| foo(e) }
  value = foo(elem)

This works, but I'd like to avoid writing foo() twice. (Consider a more complex case.)

This can be written without repeated foo() like belows, but it is slightly complicated and needs extra arrays.

  value, elem = enum.map { |e| [foo(e), e] }.min_by(&:first)

If the size of enum is enormous, it is hard to use intermediate arrays.

Enumerable#min_with_value solves this problem.

I think min_with_value is the best name I could think of, but any suggestions for better names would be appreciated.

Benchmark

https://bugs.ruby-lang.org/issues/18181#note-2

Example

Solving a traveling salesman problem in nearest neighbor algorithm.

require 'set'
Point = Struct.new(:x, :y)
points = Set.new([Point.new(1, 1), Point.new(2, 4), Point.new(3, 3), Point.new(2, 2), Point.new(0, 1)])

total = 0
current = points.first
points.delete(current)
path = [current]

until points.empty?
  current, distance = points.min_with_value do |point|
    Math.hypot(current.x - point.x, current.y - point.y)
  end
  total += distance
  points.delete(current)
  path << current
end

p path
p total

Updated by sawa (Tsuyoshi Sawada) about 1 month ago

This is a frequently seen use case. I use code like this:

%w(abcde fg hi jkl mn).group_by(&:size).min # => [2, ["fg", "hi", "mn"]]

and I don't see a strong need to shorten it into a single method although I am not strongly against doing so. However, I have some concerns with your proposal:

  1. Its use will be limited if it only picks a single element when the min/max element is not unique. I propose that it should rather hold an array of the corresponding elements.

  2. I don't think the word "value" is appropriate. By this word, what comes to mind most naturally is the min/max value rather than the enumerated element. Perhaps a word like "element(s)" should be used.

  3. I don't think the order of the elements you proposed is appropriate. With each_with_object, the enumerated entity is followed by the accumulating object, i.e., the word order matches the order of the elements within the subarray. I think you should follow this practice.

  4. The methods min/max are used without a block, min_by / max_by are used with a block. And in this case, you are using a block.

Thus, I think your proposal should be modified into something like the following (which looks so similar to what can be done already as I have shown above):

%w(abcde fg hi jkl mn).min_by_with_elements(&:size) # => [2, ["fg", "hi", "mn"]]

Updated by kyanagi (Kouhei Yanagita) about 1 month ago

enum.group_by { ... } holds all of the elements of enum,
so it consumes a lot of memory and is very slow if enum is enormous.

require 'benchmark'
Benchmark.bm(16) do |x|
  enum = -1000000..1000000
  x.report("group_by") { enum.group_by { |i| i.abs }.min }
  x.report("map.min_by") { enum.map { |i| [i.abs, i] }.min_by(&:first) }
  x.report("min_with_value") { enum.min_with_value { |i| i.abs } }
  x.report("each") { enum.each { |i| i.abs } }
end
                       user     system      total        real
group_by           1.238664   0.243779   1.482443 (  1.483547)
map.min_by         0.721992   0.047738   0.769730 (  0.770767)
min_with_value     0.236433   0.004105   0.240538 (  0.240596)
each               0.150815   0.000000   0.150815 (  0.151704)

#min and #min_by return single element even through there are multiple smallest elements.
There may be cases where it is useful to return multiple elements,
but single element will be enough for most cases.
Writing elements.first everytime is a little bit of a pain.
Besides, size of the elements may become very large. (if almost all of the elements are mapped into the same value)

I think #min_with_value should return the element as primary (i.e. the element is first, the block's value is second)
because #min and #min_by return the element.
Additionally, this is similar to the way #each_with_index and #each_with_object pass their block arguments.

enum.each_with_index do |elem, i| # <- the element is first, the other is second.
end

enum.each_with_object([]) do |elem, obj| # <- the element is first, the other is second.
end

Naming issue:

The behavior of this method is described as "The minimum element by the block value and its value".
Based on this, the name will be something like min_element_by_value_and_value or min_element_by_value_with_value if verbosely named.

Considering the method #min which returns the element, the word "element" is thought to be able to omitted.
So I suggest the short name #min_with_value. (I think #min_by_with_value is also possible.)
The word "with" implies that "min(_element)" and "value" are different objects.

Actions #3

Updated by kyanagi (Kouhei Yanagita) about 1 month ago

  • Description updated (diff)

Updated by kyanagi (Kouhei Yanagita) about 1 month ago

I added an example code using #min_with_value.

Updated by Dan0042 (Daniel DeLorme) about 1 month ago

I like this proposal, but I think sawa (Tsuyoshi Sawada)'s idea has value also. It would be simple to accommodate both uses with a single syntax:

%w(abcde fg xx hijk).min_with_value(&:size) # => [2, 'fg', 'xx']
v,elem = %w(abcde fg xx hijk).min_with_value(&:size)   #in the normal case where you only want the 'first' minimum
v,*elems = %w(abcde fg xx hijk).min_with_value(&:size) #when you want all of them
Actions #6

Updated by kyanagi (Kouhei Yanagita) about 1 month ago

  • Description updated (diff)

Updated by kyanagi (Kouhei Yanagita) about 1 month ago

Description about #min_with_value(n) added. It corresponds to #min(n).

Updated by Eregon (Benoit Daloze) about 1 month ago

:+1: I've needed this several times.
I think the semantics in the description are the best and clearest.

Actions

Also available in: Atom PDF