Bug #11471
closedmin, min_by, max, max_by with optional parameter return a wrong value
Added by sawa (Tsuyoshi Sawada) about 10 years ago. Updated about 10 years ago.
Description
This is reported in StackOverflow: http://stackoverflow.com/questions/32121749/why-20-13-14-min2-13-20. Sometimes min
, min_by
, max
, max_by
with an optional parameter return a wrong value.
[20, 32, 32, 21, 30, 25, 29, 13, 14].min(2) # => [13, 20]
[20, 32, 32, 21, 30, 25, 29, 13, 14].min_by(2, &:itself) # => [13, 20]
[0, 0, 0, 0, 0, 0, 1, 3, 2].max(2) # => [3, 1]
[0, 0, 0, 0, 0, 0, 1, 3, 2].max_by(2, &:itself) # => [3, 1]
Files
enum_nmin_filter_fix.patch (2.42 KB) enum_nmin_filter_fix.patch | helfper (Helder Pereira), 08/21/2015 01:31 AM |
Updated by tonkpils (Leo Correa) about 10 years ago
Actions
#1
- File enum_bug_fix.patch added
I have made a patch that seems to fix the issue for these cases. I'm not entirely sure why there was a multiplier for data.bufmax in the first place but it seemed to be the cause.
Updated by helfper (Helder Pereira) about 10 years ago
Actions
#2
Leo Correa, I think your patch will fail for this example:
[2, 4, 8, 6, 7].min(4) #=> [2, 4, 6, 8]
Updated by helfper (Helder Pereira) about 10 years ago
Actions
#3
I suppose I found the problem. Taking the first example:
[20, 32, 32, 21, 30, 25, 29, 13, 14].min(2)
This will call the function "nmin_run" in the file "enum.c", which sets "bufmax" to 4 times the number of minimums (n) we want (for the example, bufmax is 8), and then in the line 1327 will call the function "nmin_i" for each element of the original array.
In the function "nmin_i", when the buffer is full ("data->curlen == data->bufmax"), the function "nmin_filter" is called. In the example, that happens when curlen is 8, and so the buffer is [20, 32, 32, 21, 30, 25, 29, 13]. The "nmin_filter" will do a quicksort until the n smallest elements so far are on the leftmost part of the buffer, and will discard the rest of the elements, which leaves us with [20, 13] in the buffer.
And now starts the problem. At the end of "nmin_filter" the limit (apparently with the intention of storing the greatest value in the buffer) is set to the last value in the buffer (in the example, 13), which is not true. And then based on that value "nmin_i" will discard all remaining elements greater than that (in the example, discarding the 14). The buffer is then sorted and it returns:
[13, 20]
So the solution is either remove all the limit-related part, or take the last pivot as the limit.
Updated by tonkpils (Leo Correa) about 10 years ago
Actions
#4
Helder Pereira wrote:
Leo Correa, I think your patch will fail for this example:
[2, 4, 8, 6, 7].min(4) #=> [2, 4, 6, 8]
I made a test case for that and passed with [2, 4, 6, 7]
Updated by tonkpils (Leo Correa) about 10 years ago
Actions
#5
- File enum_bug_fix.patch added
Updated by helfper (Helder Pereira) about 10 years ago
Actions
#6
Yes, I was wrong. I missed this check in "nmin_filter":
if (data->curlen <= data->n)
return;
Which means that your patch will never execute the "nmin_filter" (that is where the bug lays) until it is called in "nmin_run", and consequently the buffer will grow until the size of the original array (and will not be proportionally to the number of elements we want anymore).
Because your patch makes this piece of code in "nmin_i" useless:
if (data->curlen == data->bufmax) {
nmin_filter(data);
}
Updated by tonkpils (Leo Correa) about 10 years ago
Actions
#7
- File deleted (
enum_bug_fix.patch)
Updated by nkmrya (Yasuhiro Nakamura) about 10 years ago
Actions
#8
I found the ticket which to request and discuss this feature. Thicket is #8887.
Updated by helfper (Helder Pereira) about 10 years ago
Actions
#9
Here is my fix.
I also created a pull request:
https://github.com/ruby/ruby/pull/1005
Updated by tonkpils (Leo Correa) about 10 years ago
Actions
#10
- File deleted (
enum_bug_fix.patch)
Updated by nagachika (Tomoyuki Chikanaga) about 10 years ago
Actions
#11
- Status changed from Open to Assigned
- Assignee set to akr (Akira Tanaka)
- Backport changed from 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN to 2.0.0: DONTNEED, 2.1: DONTNEED, 2.2: REQUIRED
Thank you for your report and patches.
I've confirmed that the issue exists in ruby-2.2.3 too.
akr san, please review the PR? https://github.com/ruby/ruby/pull/1005
Updated by funny_falcon (Yura Sokolov) about 10 years ago
Actions
#12
Algorithm should be based on a heap.
Updated by akr (Akira Tanaka) about 10 years ago
Actions
#13
- Status changed from Assigned to Closed
Applied in changeset r52026.
- enum.c (nmin_filter): Fix limit value.
patch by Helder Pereira.
[Bug #11471] [ruby-core:70477]
Updated by nagachika (Tomoyuki Chikanaga) about 10 years ago
Actions
#14
- Backport changed from 2.0.0: DONTNEED, 2.1: DONTNEED, 2.2: REQUIRED to 2.0.0: DONTNEED, 2.1: DONTNEED, 2.2: DONE
Backported into ruby_2_2
branch at r52032.