## Feature #14383

closed### Making prime_division in prime.rb Ruby 3 ready.

**Description**

I have been running old code in Ruby 2.5.0 (released 2017.12.25) to check for

speed and compatibility. I still see the codebase in `prime.rb`

hardly has

changed at all (except for replacing `Math.sqrt`

with `Integer.sqrt`

).

To achieve the Ruby 3 goal to make it at least three times faster than Ruby 2

there are three general areas where Ruby improvements can occur.

- increase the speed of its implementation at the machine level
- rewrite its existing codebase in a more efficient|faster manner
- use faster algorithms to implement routines and functions

I want to suggest how to address the later two ways to improve performance of

specifically the `prime_division`

method in the `prime.rb`

library.

I've raised and made suggestions to some of these issues here

ruby-issues forum and now hope to invigorate additional discussion.

Hopefully with the release of 2.5.0, and Ruby 3 conceptually closer to reality,

more consideration will be given to coding and algorithmic improvements to

increase its performance too.

**Mathematical correctness**

First I'd like to raise what I consider *math bugs* in `prime_division`

, in how

it handles `0`

and `-1`

inputs.

```
> -1.prime_division
=> [[-1,1]]
> 0.prime_division
Traceback (most recent call last):
4: from /home/jzakiya/.rvm/rubies/ruby-2.5.0/bin/irb:11:in `<main>'
3: from (irb):85
2: from /home/jzakiya/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/prime.rb:30:in `prime_division'
1: from /home/jzakiya/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/prime.rb:203:in `prime_division'
ZeroDivisionError (ZeroDivisionError)
```

First, `0`

is a perfectly respectable integer, and is non-prime, so its output should be `[]`

,

an empty array to denote it has no prime factors. The existing behavior is solely a matter of

`prime_division`

's' implementation, and does not take this mathematical reality into account.

The output for `-1`

is also mathematically wrong because `1`

is also non-prime (and correctly

returns `[]`

), well then mathematically so should `-1`

. Thus, `prime_division`

treats `-1`

as

a new prime number, and factorization, that has no mathematical basis. Thus, for mathematical

correctness and consistency `-1`

and `0`

should both return `[]`

, as none have prime factors.

```
> -1.prime_division
=> []
> 0.prime_division
=> []
> 1.prime_division
=> []
```

There's a very simple one-line fix to `prime_division`

to do this:

```
# prime.rb
class Prime
def prime_division(value, generator = Prime::Generator23.new)
-- raise ZeroDivisionError if value == 0
++ return [] if (value.abs | 1) == 1
```

**Simple Code and Algorithmic Improvements**

As stated above, besides the machine implementation improvements, the other

areas of performance improvements will come from coding rewrites and better

algorithms. Below is the coding of `prime_division`

. This coding has existed at

least since Ruby 2.0 (the farthest I've gone back).

```
# prime.rb
class Integer
# Returns the factorization of +self+.
#
# See Prime#prime_division for more details.
def prime_division(generator = Prime::Generator23.new)
Prime.prime_division(self, generator)
end
end
class Prime
def prime_division(value, generator = Prime::Generator23.new)
raise ZeroDivisionError if value == 0
if value < 0
value = -value
pv = [[-1, 1]]
else
pv = []
end
generator.each do |prime|
count = 0
while (value1, mod = value.divmod(prime)
mod) == 0
value = value1
count += 1
end
if count != 0
pv.push [prime, count]
end
break if value1 <= prime
end
if value > 1
pv.push [value, 1]
end
pv
end
end
```

This can be rewritten in more modern and idiomatic Ruby, to become much shorter

and easier to understand.

```
require 'prime.rb'
class Integer
def prime_division1(generator = Prime::Generator23.new)
Prime.prime_division1(self, generator)
end
end
class Prime
def prime_division1(value, generator = Prime::Generator23.new)
# raise ZeroDivisionError if value == 0
return [] if (value.abs | 1) == 1
pv = value < 0 ? [[-1, 1]] : []
value = value.abs
generator.each do |prime|
count = 0
while (value1, mod = value.divmod(prime); mod) == 0
value = value1
count += 1
end
pv.push [prime, count] unless count == 0
break if prime > value1
end
pv.push [value, 1] if value > 1
pv
end
end
```

By merely rewriting it we get smaller|concise code, that's easier to understand,

which is slightly faster. A *triple win!* Just paste the above code into a 2.5.0

terminal session, and run the benchmarks below.

```
def tm; s=Time.now; yield; Time.now-s end
n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division }
[[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]]
=> 27.02951016
n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division1 }
[[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]]
=> 25.959149721
```

Again, we get a *triple win* to this old codebase by merely rewriting it. It can

be made 3x faster by leveraging the `prime?`

method from the `OpenSSL`

library to

perform a more efficient|faster factoring algorithm, and implementation.

```
require 'prime.rb'
require 'openssl'
class Integer
def prime_division2(generator = Prime::Generator23.new)
return [] if (self.abs | 1) == 1
pv = self < 0 ? [-1] : []
value = self.abs
prime = generator.next
until value.to_bn.prime? or value == 1
while prime
(pv << prime; value /= prime; break) if value % prime == 0
prime = generator.next
end
end
pv << value if value > 1
pv.group_by {|prm| prm }.map{|prm, exp| [prm, exp.size] }
end
end
```

Here we're making much better use of Ruby idioms and libraries (`enumerable`

and

`openssl`

), leading to a much greater performance increase. A bigger *triple win*.

Pasting this code into a 2.5.0 terminal session gives the following results.

```
# Hardware: System76 laptop; I7 cpu @ 3.5GHz, 64-bit Linux
def tm; s=Time.now; yield; Time.now-s end
n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division }
[[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]]
=> 27.02951016
n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division1 }
[[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]]
=> 25.959149721
n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division2 }
[[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]]
=> 9.39650374
```

`prime_division2`

is much more usable for significantly larger numbers and use

cases than `prime_division`

. I can even do multiple times better than this, if

you review the above cited forum thread.

My emphasis here is to show there are a lot of possible *low hanging fruit*

performance gains ripe for the picking to achieve Ruby 3 performance goals, if we

look (at minimum) for simpler|better code rewrites, and then algorithmic upgrades.

So the question is, are the devs willing to upgrade the codebase to provide the

demonstrated performance increases shown here for `prime_division`

?

**Files**