Project

General

Profile

Actions

Bug #11471

closed

min, min_by, max, max_by with optional parameter return a wrong value

Added by sawa (Tsuyoshi Sawada) over 9 years ago. Updated about 9 years ago.

Status:
Closed
Target version:
-
[ruby-core:<unknown>]

Description

This is reported in StackOverflow: http://stackoverflow.com/questions/32121749/why-20-13-14-min2-13-20. Sometimes min, min_by, max, max_by with an optional parameter return a wrong value.

[20, 32, 32, 21, 30, 25, 29, 13, 14].min(2) # => [13, 20]
[20, 32, 32, 21, 30, 25, 29, 13, 14].min_by(2, &:itself) # => [13, 20]
[0, 0, 0, 0, 0, 0, 1, 3, 2].max(2) # => [3, 1]
[0, 0, 0, 0, 0, 0, 1, 3, 2].max_by(2, &:itself) # => [3, 1]

Files

enum_nmin_filter_fix.patch (2.42 KB) enum_nmin_filter_fix.patch helfper (Helder Pereira), 08/21/2015 01:31 AM
Actions #1

Updated by tonkpils (Leo Correa) over 9 years ago

  • File enum_bug_fix.patch added

I have made a patch that seems to fix the issue for these cases. I'm not entirely sure why there was a multiplier for data.bufmax in the first place but it seemed to be the cause.

Actions #2

Updated by helfper (Helder Pereira) over 9 years ago

Leo Correa, I think your patch will fail for this example:

[2, 4, 8, 6, 7].min(4) #=> [2, 4, 6, 8]
Actions #3

Updated by helfper (Helder Pereira) over 9 years ago

I suppose I found the problem. Taking the first example:

[20, 32, 32, 21, 30, 25, 29, 13, 14].min(2)

This will call the function "nmin_run" in the file "enum.c", which sets "bufmax" to 4 times the number of minimums (n) we want (for the example, bufmax is 8), and then in the line 1327 will call the function "nmin_i" for each element of the original array.

In the function "nmin_i", when the buffer is full ("data->curlen == data->bufmax"), the function "nmin_filter" is called. In the example, that happens when curlen is 8, and so the buffer is [20, 32, 32, 21, 30, 25, 29, 13]. The "nmin_filter" will do a quicksort until the n smallest elements so far are on the leftmost part of the buffer, and will discard the rest of the elements, which leaves us with [20, 13] in the buffer.

And now starts the problem. At the end of "nmin_filter" the limit (apparently with the intention of storing the greatest value in the buffer) is set to the last value in the buffer (in the example, 13), which is not true. And then based on that value "nmin_i" will discard all remaining elements greater than that (in the example, discarding the 14). The buffer is then sorted and it returns:

[13, 20]

So the solution is either remove all the limit-related part, or take the last pivot as the limit.

Actions #4

Updated by tonkpils (Leo Correa) over 9 years ago

Helder Pereira wrote:

Leo Correa, I think your patch will fail for this example:

[2, 4, 8, 6, 7].min(4) #=> [2, 4, 6, 8]

I made a test case for that and passed with [2, 4, 6, 7]

Actions #5

Updated by tonkpils (Leo Correa) over 9 years ago

  • File enum_bug_fix.patch added
Actions #6

Updated by helfper (Helder Pereira) over 9 years ago

Yes, I was wrong. I missed this check in "nmin_filter":

    if (data->curlen <= data->n)
        return;

Which means that your patch will never execute the "nmin_filter" (that is where the bug lays) until it is called in "nmin_run", and consequently the buffer will grow until the size of the original array (and will not be proportionally to the number of elements we want anymore).

Because your patch makes this piece of code in "nmin_i" useless:

    if (data->curlen == data->bufmax) {
        nmin_filter(data);
    }
Actions #7

Updated by tonkpils (Leo Correa) over 9 years ago

  • File deleted (enum_bug_fix.patch)
Actions #8

Updated by nkmrya (Yasuhiro Nakamura) over 9 years ago

I found the ticket which to request and discuss this feature. Thicket is #8887.

Actions #10

Updated by tonkpils (Leo Correa) over 9 years ago

  • File deleted (enum_bug_fix.patch)
Actions #11

Updated by nagachika (Tomoyuki Chikanaga) over 9 years ago

  • Status changed from Open to Assigned
  • Assignee set to akr (Akira Tanaka)
  • Backport changed from 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN to 2.0.0: DONTNEED, 2.1: DONTNEED, 2.2: REQUIRED

Thank you for your report and patches.

I've confirmed that the issue exists in ruby-2.2.3 too.

akr san, please review the PR? https://github.com/ruby/ruby/pull/1005

Actions #12

Updated by funny_falcon (Yura Sokolov) over 9 years ago

Algorithm should be based on a heap.

Actions #13

Updated by akr (Akira Tanaka) about 9 years ago

  • Status changed from Assigned to Closed

Applied in changeset r52026.


Actions #14

Updated by nagachika (Tomoyuki Chikanaga) about 9 years ago

  • Backport changed from 2.0.0: DONTNEED, 2.1: DONTNEED, 2.2: REQUIRED to 2.0.0: DONTNEED, 2.1: DONTNEED, 2.2: DONE

Backported into ruby_2_2 branch at r52032.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0