Feature #6256: Slightly improve ruby_qsort performance - Ruby - Ruby Issue Tracking System

Actions

Copy link

Feature #6256

closed

Slightly improve ruby_qsort performance

Added by MartinBosslet (Martin Bosslet) about 13 years ago. Updated over 6 years ago.

Status:

Feedback

Assignee:

MartinBosslet (Martin Bosslet)

Target version:

[ruby-core:44136]

Description

Hi all,

I think I may have found a way to slightly improve the performance of ruby_qsort.
Quicksort running time is slightly decreased if the recursion depth (or in our
case, rather iteration depth) is bounded by leaving sub problems larger than or
equal to some cutoff bound k untouched and running Insertion Sort on these small
sub problems to finalize the sorting.

I experimented with this, but to no avail, only marginal improvements if any. Then
I remembered that instead of running Insertion Sort on each sub problem, it is
equivalent in terms of running time to run one single Insertion Sort on the whole
nearly sorted array as a final step. And in practice, this turns out to run faster
than without the optimization. In my tests, execution time dropped to about 95% on
average with an optimal cutoff (64-bit Fedora 15) [1].

Now this ain't the world - but it is faster, and I could very well imagine that there
is still room for improving my code. In my tests, the optimal cutoff seems to be ~13
for Integers and ~8 for Strings and Symbols. I imagine the more costly the comparisons,
the lower will be the optimal cutoff. I've tested only on 64 Bit yet, but I will do so
for 32 Bit, too.

If it turns out that this runs faster regardless of the architecture in use, with an
optimal cutoff yet to be determined, do you think this would be a useful addition?

I have attached a C extension for testing and discussing, it's mostly a one-to-one copy of
the code in util.c. I just added mmassign and insertion_sort plus the few lines that respect
the cutoff in rqsort (had to rename it, otherwise it would collide with the real "ruby_qsort").

-Martin

[1] https://github.com/emboss/hybridsort/blob/master/hybrid-sort-results

Files

Download all files

hybridsort.tar.gz (3.93 KB) hybridsort.tar.gz		MartinBosslet (Martin Bosslet), 04/05/2012 10:02 AM
rubyqisort.patch (5.04 KB) rubyqisort.patch		MartinBosslet (Martin Bosslet), 04/07/2012 07:36 AM
rubyqisort4.patch (3.66 KB) rubyqisort4.patch		MartinBosslet (Martin Bosslet), 04/07/2012 10:35 AM
rubyqisort5.patch (3.69 KB) rubyqisort5.patch		MartinBosslet (Martin Bosslet), 04/07/2012 11:01 AM

Actions

Copy link

#1 [ruby-core:44144]

Updated by duerst (Martin Dürst) about 13 years ago

I haven't looked at the sources, but this exact optimization is very well known in the literature, and I'm showing it to my students in a lecture on algorithms and data structures every year (see e.g. http://www.sw.it.aoyama.ac.jp/2011/DA/programs/6qsort.rb, sorry, comments are in Japanese). I'm quite a bit surprised that this optimization isn't yet made in Ruby. The only potential problem is that the final insertion sort may mask errors in the quicksort, but as Ruby's current quicksort is working well, that should not be a problem. And a cutoff of about 10 is what's generally claimed to be a good value. The speed gains/losses when shifting the cutoff value by just a little bit shouldn't be very big.

So I'm definitely encouraging adoption of this improvement.

Actions

Copy link

#2 [ruby-core:44146]

Updated by mame (Yusuke Endoh) about 13 years ago

Status changed from Open to Assigned
Assignee set to MartinBosslet (Martin Bosslet)

Sound good. I agree, too.

Please create a patch and show benchmark.

--
Yusuke Endoh mame@tsg.ne.jp

Actions

Copy link

#3 [ruby-core:44169]

Updated by MartinBosslet (Martin Bosslet) about 13 years ago

File rubyqisort.patch rubyqisort.patch added

I added a patch, together with benchmarks.

Here's the results on my machine without the optimization:

$ ruby driver.rb -e ruby -p bm_sort -r 10¶

benchmark results:
name ruby 2.0.0dev (2012-04-05 trunk 35241) [x86_64-linux]
sort_int 0.378
sort_int_custom 0.327
sort_int_rev 0.111
sort_int_sorted 0.100
sort_str_rnd 0.107

and here with the patch applied:

$ ruby driver.rb -e ruby -p bm_sort -r 10¶

benchmark results:
name ruby 2.0.0dev (2012-04-05 trunk 35241) [x86_64-linux]
sort_int 0.367
sort_int_custom 0.332
sort_int_rev 0.229
sort_int_sorted 0.218
sort_str_rnd 0.106

From running them several times, my impression is that it's definetely
faster for sorting Fixnums, but there's no real improvement as soon
as we compare Strings or use a custom comparison, as I assumed before,
the costlier the comparison becomes the less advantage we get, at least
that's how it seemed in my tests.

Another fact is that we definitely lose in the "already ordered" cases.
The existing Quicksort does a linear check for "ordered regions" already,
to prevent worst case behavior. I actually had to take those out in
order to get the improvement with Insertion Sort. If I put them back in,
then the whole advantage is gone, and the "optimized" version runs
slightly slower than the old version.

Applying just one final Insertion Sort instead of sorting each sub
problem below the cut off with it also turned out to be slightly
slower unlike in my test extension.

I'm confused to see only minimal improvements, when implementing a
"textbook" Quicksort with the same optimization, the benefits are much
more obvious.

I assume there's more optimization in Ruby's current algorithm
that would no longer be needed with additional Insertion Sort,
maybe you have some ideas? Or maybe there's also room to
improve my Insertion Sort?

This is as fast as I could get it right now, at least in theory it
should be possible to improve it.

Actions

Copy link

#4 [ruby-core:44173]

Updated by MartinBosslet (Martin Bosslet) about 13 years ago

File rubyqisort4.patch rubyqisort4.patch added

I could improve the Insertion Sort a bit by using memcpy/memmove.
It runs a bit faster even with the "worst case detection" of
Quicksort turned on, things look a little brighter now:

$ ruby driver.rb -p bm_sort -r 10

benchmark results:
name ruby 2.0.0dev (2012-04-05 trunk 35241) [x86_64-linux]
sort_int 0.343
sort_int_custom 0.296
sort_int_rev 0.101
sort_int_sorted 0.090
sort_str_rnd 0.099

I also checked whether Binary Insertion Sort would bring
additional improvement, it did not.

This version performs slightly better now on all fronts
and the patch needs a lot less additional code.

Actions

Copy link

#5 [ruby-core:44175]

Updated by MartinBosslet (Martin Bosslet) about 13 years ago

File rubyqisort5.patch rubyqisort5.patch added

Updated patch - I forgot that chklim may be set to 0,
so I separated chklim and cutoff.

Actions

Copy link

#6 [ruby-core:48373]

Updated by ko1 (Koichi Sasada) over 12 years ago

ping. status?

Actions

Copy link

#7 [ruby-core:49790]

Updated by MartinBosslet (Martin Bosslet) over 12 years ago

Target version changed from 2.0.0 to 2.6

It's a tiny bit faster. But it should be noticeably faster. I need to get into the details of the current qsort implementation and do some tests/calculations in order to really know where to optimize appropriately. OK if I bump to "next minor"? I'd prefer to do some more analysis.

Actions

Copy link

Updated by naruse (Yui NARUSE) over 7 years ago

Target version deleted (~~2.6~~)

Actions

Copy link

#9 [ruby-core:90995]

Updated by mame (Yusuke Endoh) over 6 years ago

Status changed from Assigned to Feedback

A volunteer is welcome: update the patch for trunk, test, perform re-experiment, etc.

Actions

Copy link

Also available in: Atom PDF

Like0

Like0Like0Like0Like0Like0Like0Like0Like0Like0

Project

General

Profile

Ruby

Tags

Custom queries

Feature #6256

Slightly improve ruby_qsort performance

Updated by duerst (Martin Dürst) about 13 years ago

Updated by mame (Yusuke Endoh) about 13 years ago

Updated by MartinBosslet (Martin Bosslet) about 13 years ago

$ ruby driver.rb -e ruby -p bm_sort -r 10¶

$ ruby driver.rb -e ruby -p bm_sort -r 10¶

Updated by MartinBosslet (Martin Bosslet) about 13 years ago

Updated by MartinBosslet (Martin Bosslet) about 13 years ago

Updated by ko1 (Koichi Sasada) over 12 years ago

Updated by MartinBosslet (Martin Bosslet) over 12 years ago

Updated by naruse (Yui NARUSE) over 7 years ago

Updated by mame (Yusuke Endoh) over 6 years ago