Project

General

Profile

Bug #16022

Enumerable#sort_by returns a swapped first and last element when the sort value is identical and collection is >= 8 elements

Added by ryanflach (Ryan Flach) over 1 year ago. Updated over 1 year ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
2.5.5p157
[ruby-core:93918]

Description

Overview
I've seen in the source code that the ordering may be unpredictable (https://github.com/ruby/ruby/blob/515e106fb73a1a3bc69b6f1670bbaaebf45fee30/enum.c#L1159-L1160), but it seems to consistently swap the first and last element when there are 8 or more elements.

Ruby version
ruby 2.5.5p157 (2019-03-15 revision 67260) [x86_64-darwin18]

To reproduce:

[1] pry(main)> [{a: 1, b: 1}, {a: 1, b: 2}, {a: 1, b: 3}, {a: 1, b: 4}, {a: 1, b: 5}, {a: 1, b: 6}, {a: 1, b: 7}, {a: 1, b: 8}].sort_by { |h| h[:a] }
=> [{:a=>1, :b=>8}, {:a=>1, :b=>2}, {:a=>1, :b=>3}, {:a=>1, :b=>4}, {:a=>1, :b=>5}, {:a=>1, :b=>6}, {:a=>1, :b=>7}, {:a=>1, :b=>1}]
[2] pry(main)> [{a: 1, b: 1}, {a: 1, b: 2}, {a: 1, b: 3}, {a: 1, b: 4}, {a: 1, b: 5}, {a: 1, b: 6}, {a: 1, b: 7}].sort_by { |h| h[:a] }
=> [{:a=>1, :b=>1}, {:a=>1, :b=>2}, {:a=>1, :b=>3}, {:a=>1, :b=>4}, {:a=>1, :b=>5}, {:a=>1, :b=>6}, {:a=>1, :b=>7}]

I would expect the above to remain in order, when a sort has not actually taken place.

Updated by jeremyevans0 (Jeremy Evans) over 1 year ago

  • Status changed from Open to Rejected

The current behavior is expected and documented: The result is not guaranteed to be stable. When two keys are equal, the order of the corresponding elements is unpredictable.

If you want a stable sort, you could use the approach mentioned in https://8thlight.com/blog/will-warner/2013/03/26/stable-sorting-in-ruby.html

Updated by ryanflach (Ryan Flach) over 1 year ago

jeremyevans0 (Jeremy Evans) wrote:

The current behavior is expected and documented: The result is not guaranteed to be stable. When two keys are equal, the order of the corresponding elements is unpredictable.

If you want a stable sort, you could use the approach mentioned in https://8thlight.com/blog/will-warner/2013/03/26/stable-sorting-in-ruby.html

Jeremy,

Thanks for the response. You are definitely correct regarding its documentation. What I was attempting to point out was the apparent predictability in which it is not consistent, in that the first and last elements are swapped, which appears to be reproducible up to a fairly large number of elements in the array (but never at less than 8):

[1] pry(main)> def sorted(start_number, end_number)
[1] pry(main)*   [{a: 1, b: start_number}].tap do |arr|
[1] pry(main)*     ((start_number + 1)...end_number).each do |n|
[1] pry(main)*       arr << {a: 1, b: n}
[1] pry(main)*     end
[1] pry(main)*     arr << {a: 1, b: end_number}
[1] pry(main)*   end.sort_by { |h| h[:a] }
[1] pry(main)* end
=> :sorted
[2] pry(main)> arr = sorted(1, 10_000)
=> [{:a=>1, :b=>10000},
 {:a=>1, :b=>2},
 {:a=>1, :b=>3},
 {:a=>1, :b=>4},
 {:a=>1, :b=>5},
 {:a=>1, :b=>6},
 {:a=>1, :b=>7},
 {:a=>1, :b=>8},
 {:a=>1, :b=>9},
 {:a=>1, :b=>10},
 {:a=>1, :b=>11},
 {:a=>1, :b=>12},
 {:a=>1, :b=>13},
 {:a=>1, :b=>14},
[3] pry(main)> arr.first
=> {:a=>1, :b=>10000}
[4] pry(main)> arr.last
=> {:a=>1, :b=>1}

Best,
Ryan

Updated by jeremyevans0 (Jeremy Evans) over 1 year ago

ryanflach (Ryan Flach) wrote:

Thanks for the response. You are definitely correct regarding its documentation. What I was attempting to point out was the apparent predictability in which it is not consistent

The sort isn't stable, so the order of elements when the sort_by result is equal is not specified. The "predictability in which it is not consistent" is simply due to an implementation detail. It is not a bug.

Your initial post stated I would expect the above to remain in order, when a sort has not actually taken place. That made it sound like you were expecting a stable sort and you considered the current behavior a bug. I apologize if I misunderstood and you were attempting to point out an implementation detail and not report a bug.

Updated by Hanmac (Hans Mackowiak) over 1 year ago

you can use #with_index there

data.sort_by.with_index { |h, i| [h[:a], i] }

this should return a stable result

Also available in: Atom PDF