Feature #15161
closedmaking gcd faster for 3x3
Description
With the goal of making Ruby as fast as possible for 3x3 I would like to propose
a faster implementation of the gcd
function. I use gcd
a lot in my
primes-utils
gem, and in cryptography and Number Theory problems.
The current implementation
https://ruby-doc.org/core-2.5.1/Integer.html#method-i-gcd
uses a recursive implementation.
I propose using the binary (Stein's) algorithm, which I believe has been proposed/discussed before.
https://en.wikipedia.org/wiki/Binary_GCD_algorithm
However, I recently raised an issues with Crystal's implementation (also recursive)
and suggested using Stein's algorithm, which they approved.
https://github.com/crystal-lang/crystal/issues/6683
However, the default (iterative) wikipedia implementation is nowhere near as fast as possible.
https://lemire.me/blog/2013/12/26/fastest-way-to-compute-the-greatest-common-divisor/
The author provides benchmarks of different implmentations (including recursive)
https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/blob/master/2013/12/26/gcd.cpp
and the version below is the fastest, and also the simplest.
unsigned int gcdwikipedia2fastswap(unsigned int u, unsigned int v)
{
int shift;
if (u == 0) return v;
if (v == 0) return u;
shift = __builtin_ctz(u | v);
u >>= __builtin_ctz(u);
do {
v >>= __builtin_ctz(v);
if(u>v) std::swap(u,v);
v = v - u;
} while (v != 0);
return u << shift;
}
Thank you for considering this.
Updated by shevegen (Robert A. Heiler) over 6 years ago
I think a benchmark may be useful to the issue here.
I don't think the ruby team minds any speed improvement but what
is usually done is to verify any potential gain through benchmarks
(e. g. optcarrot for, I think, ruby 3x3 goals).
Updated by mame (Yusuke Endoh) over 6 years ago
- Status changed from Open to Closed
It already uses the variant of the binary (Stein's) algorithm.
https://github.com/ruby/ruby/blob/3abbaab1a7a97d18f481164c7dc48749b86d7f39/rational.c#L285-L307
See #13503.
Updated by mame (Yusuke Endoh) over 6 years ago
- Is duplicate of Bug #13503: Improve performance of some Time & Rational methods added