## Feature #13219

closed### bug in Math.sqrt(n).to_i, to compute integer squareroot, new word to accurately fix it

**Description**

In doing a math application using **Math.sqrt(n).to_i** to compute integer squareroots

of integers I started noticing errors for numbers > 10**28.

I coded an algorithm that accurately computes the integer squareroot for arbirary sized numbers

but its significantly slower than the math library written in C.

Thus, I request a new method **Math.intsqrt(n)** be created, that is coded in C and part of the

core library, that will compute the integer squareroots of integers accurately and fast.

Here is working highlevel code to accurately compute the integer squareroot.

```
def intsqrt(n)
bits_shift = (n.to_s(2).size)/2 + 1
bitn_mask = root = 1 << bits_shift
while true
root ^= bitn_mask if (root * root) > n
bitn_mask >>= 1
return root if bitn_mask == 0
root |= bitn_mask
end
end
def intsqrt1(n)
return n if n | 1 == 1 # if n is 0 or 1
bits_shift = (Math.log2(n).ceil)/2 + 1
bitn_mask = root = 1 << bits_shift
while true
root ^= bitn_mask if (root * root) > n
bitn_mask >>= 1
return root if bitn_mask == 0
root |= bitn_mask
end
end
require "benchmark/ips"
Benchmark.ips do |x|
n = 10**40
puts "integer squareroot tests for n = #{n}"
x.report("intsqrt(n)" ) { intsqrt(n) }
x.report("intsqrt1(n)" ) { intsqrt1(n) }
x.report("Math.sqrt(n).to_i") { Math.sqrt(n).to_i }
x.compare!
end
```

Here's why it needs to be done in C.

```
2.4.0 :178 > load 'intsqrttest.rb'
integer squareroot tests for n = 10000000000000000000000000000000000000000
Warming up --------------------------------------
intsqrt(n) 5.318k i/100ms
intsqrt1(n) 5.445k i/100ms
Math.sqrt(n).to_i 268.281k i/100ms
Calculating -------------------------------------
intsqrt(n) 54.219k (± 5.5%) i/s - 271.218k in 5.017552s
intsqrt1(n) 55.872k (± 5.4%) i/s - 283.140k in 5.082953s
Math.sqrt(n).to_i 5.278M (± 6.1%) i/s - 26.560M in 5.050707s
Comparison:
Math.sqrt(n).to_i: 5278477.8 i/s
intsqrt1(n): 55871.7 i/s - 94.47x slower
intsqrt(n): 54219.4 i/s - 97.35x slower
=> true
2.4.0 :179 >
```

Here are examples of math errors using **Math.sqrt(n).to_i** run on Ruby-2.4.0.

```
2.4.0 :101 > n = 10**27; puts n, (a = intsqrt(n)), a*a, (b = intsqrt1(n)), b*b, (c = Math.sqrt(n).to_i), c*c
1000000000000000000000000000
31622776601683
999999999999949826038432489
31622776601683
999999999999949826038432489
31622776601683
999999999999949826038432489
=> nil
2.4.0 :102 > n = 10**28; puts n, (a = intsqrt(n)), a*a, (b = intsqrt1(n)), b*b, (c = Math.sqrt(n).to_i), c*c
10000000000000000000000000000
100000000000000
10000000000000000000000000000
100000000000000
10000000000000000000000000000
100000000000000
10000000000000000000000000000
=> nil
2.4.0 :103 > n = 10**29; puts n, (a = intsqrt(n)), a*a, (b = intsqrt1(n)), b*b, (c = Math.sqrt(n).to_i), c*c
100000000000000000000000000000
316227766016837
99999999999999409792567484569
316227766016837
99999999999999409792567484569
316227766016837
99999999999999409792567484569
=> nil
2.4.0 :104 > n = 10**30; puts n, (a = intsqrt(n)), a*a, (b = intsqrt1(n)), b*b, (c = Math.sqrt(n).to_i), c*c
1000000000000000000000000000000
1000000000000000
1000000000000000000000000000000
1000000000000000
1000000000000000000000000000000
1000000000000000
1000000000000000000000000000000
=> nil
2.4.0 :105 > n = 10**31; puts n, (a = intsqrt(n)), a*a, (b = intsqrt1(n)), b*b, (c = Math.sqrt(n).to_i), c*c
10000000000000000000000000000000
3162277660168379
9999999999999997900254631487641
3162277660168379
9999999999999997900254631487641
3162277660168379
9999999999999997900254631487641
=> nil
2.4.0 :106 > n = 10**32; puts n, (a = intsqrt(n)), a*a, (b = intsqrt1(n)), b*b, (c = Math.sqrt(n).to_i), c*c
100000000000000000000000000000000
10000000000000000
100000000000000000000000000000000
10000000000000000
100000000000000000000000000000000
10000000000000000
100000000000000000000000000000000
=> nil
2.4.0 :107 > n = 10**33; puts n, (a = intsqrt(n)), a*a, (b = intsqrt1(n)), b*b, (c = Math.sqrt(n).to_i), c*c
1000000000000000000000000000000000
31622776601683793
999999999999999979762122758866849
31622776601683793
999999999999999979762122758866849
31622776601683792
999999999999999916516569555499264
=> nil
2.4.0 :108 > n = 10**34; puts n, (a = intsqrt(n)), a*a, (b = intsqrt1(n)), b*b, (c = Math.sqrt(n).to_i), c*c
10000000000000000000000000000000000
100000000000000000
10000000000000000000000000000000000
100000000000000000
10000000000000000000000000000000000
100000000000000000
10000000000000000000000000000000000
=> nil
2.4.0 :109 > n = 10**35; puts n, (a = intsqrt(n)), a*a, (b = intsqrt1(n)), b*b, (c = Math.sqrt(n).to_i), c*c
100000000000000000000000000000000000
316227766016837933
99999999999999999873578871987712489
316227766016837933
99999999999999999873578871987712489
316227766016837952
100000000000000011890233980627554304
=> nil
2.4.0 :110 > n = 10**36; puts n, (a = intsqrt(n)), a*a, (b = intsqrt1(n)), b*b, (c = Math.sqrt(n).to_i), c*c
1000000000000000000000000000000000000
1000000000000000000
1000000000000000000000000000000000000
1000000000000000000
1000000000000000000000000000000000000
1000000000000000000
1000000000000000000000000000000000000
=> nil
2.4.0 :111 > n = 10**37; puts n, (a = intsqrt(n)), a*a, (b = intsqrt1(n)), b*b, (c = Math.sqrt(n).to_i), c*c
10000000000000000000000000000000000000
3162277660168379331
9999999999999999993682442519108007561
3162277660168379331
9999999999999999993682442519108007561
3162277660168379392
10000000000000000379480317059650289664
=> nil
2.4.0 :112 >
```

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

On my 64-bit Linux OS laptop with an I7 cpu, I tested that ** Math.sqrt(n).to_i** gives correct

answers from

`(0..9_999_899_999_899_999_322_536_673_279)`

. This was also the same on JRuby-9.1.7.0.Thus, I can create a hybrid method to maximize speed and accuracy like this.

```
def sqrt_i(n)
if n <= 9_999_899_999_899_999_322_536_673_279
Math.sqrt(n).to_i
else
intsqrt1(n)
end
end
def intsqrt1(n)
return n if n | 1 == 1
bits_shift = (Math.log2(n).ceil)/2 + 1
bitn_mask = root = 1 << bits_shift
while true
root ^= bitn_mask if (root * root) > n
bitn_mask >>= 1
return root if bitn_mask == 0
root |= bitn_mask
end
end
```

Test results below.

```
2.4.0 :415 > n = 9999899999899999322536673279; puts n, (a = intsqrt(n)), a*a, (b = intsqrt1(n)), b*b, (c = sqrt_i(n)), c*c, (d = Math.sqrt(n).to_i), d*d
9999899999899999322536673279
99999499998249
9999899999899801751003066001
99999499998249
9999899999899801751003066001
99999499998249
9999899999899801751003066001
99999499998249
9999899999899801751003066001
=> nil
2.4.0 :416 > n = 9999899999899999322536673279 - 1; puts n, (a = intsqrt(n)), a*a, (b = intsqrt1(n)), b*b, (c = sqrt_i(n)), c*c, (d = Math.sqrt(n).to_i), d*d
9999899999899999322536673278
99999499998249
9999899999899801751003066001
99999499998249
9999899999899801751003066001
99999499998249
9999899999899801751003066001
99999499998249
9999899999899801751003066001
=> nil
2.4.0 :417 > n = 9999899999899999322536673279 + 1; puts n, (a = intsqrt(n)), a*a, (b = intsqrt1(n)), b*b, (c = sqrt_i(n)), c*c, (d = Math.sqrt(n).to_i), d*d
9999899999899999322536673280
99999499998249
9999899999899801751003066001
99999499998249
9999899999899801751003066001
99999499998249
9999899999899801751003066001
99999499998250
9999899999900001750003062500
=> nil
```

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

Since this method is designed to operate on integers it would be nice to make it

a ** class Integer** method, then you can do this:

`4829391.sqrt_i`

(or whatever name it's given), which allows you to easily do method chaining.

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

This is the C version of the algorithm.

I put the part before the start of the **while** loop of the ruby

version to make it applicable for arbitrary size integers.

http://www.codecodex.com/wiki/Calculate_an_integer_square_root

```
unsigned int sqrt32(unsigned long n)
{
unsigned int c = 0x8000;
unsigned int g = 0x8000;
for (;;) {
if (g*g > n) {
g ^= c;
}
c >>= 1;
if (c == 0) {
return g;
}
g |= c;
}
}
```

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

I just realized ** class Integer** has a

**instance_method.**

`bit_length`

It makes the code a bit simpler, easier to understand, and faster.

```
def intsqrt(n)
bits_shift = (n.to_s(2).size)/2 + 1
bitn_mask = root = 1 << bits_shift
while true
root ^= bitn_mask if (root * root) > n
bitn_mask >>= 1
return root if bitn_mask == 0
root |= bitn_mask
end
end
def intsqrt1(n)
return n if n | 1 == 1
bits_shift = (Math.log2(n).ceil)/2 + 1
bitn_mask = root = 1 << bits_shift
while true
root ^= bitn_mask if (root * root) > n
bitn_mask >>= 1
return root if bitn_mask == 0
root |= bitn_mask
end
end
def intsqrt2(n)
bits_shift = (n.bit_length)/2 + 1
bitn_mask = root = 1 << bits_shift
while true
root ^= bitn_mask if (root * root) > n
bitn_mask >>= 1
return root if bitn_mask == 0
root |= bitn_mask
end
end
require "benchmark/ips"
Benchmark.ips do |x|
n = 10**40
puts "integer squareroot tests for n = #{n}"
x.report("intsqrt(n)" ) { intsqrt(n) }
x.report("intsqrt1(n)" ) { intsqrt1(n) }
x.report("intsqrt2(n)" ) { intsqrt2(n) }
x.report("Math.sqrt(n).to_i") { Math.sqrt(n).to_i }
x.compare!
end
```

It's a bit faster.

```
2.4.0 :299 > load 'intsqrttest.rb'
integer squareroot tests for n = 10000000000000000000000000000000000000000
Warming up --------------------------------------
intsqrt(n) 5.495k i/100ms
intsqrt1(n) 5.679k i/100ms
intsqrt2(n) 5.759k i/100ms
Math.sqrt(n).to_i 261.831k i/100ms
Calculating -------------------------------------
intsqrt(n) 56.563k (± 5.9%) i/s - 285.740k in 5.069484s
intsqrt1(n) 58.567k (± 6.1%) i/s - 295.308k in 5.062216s
intsqrt2(n) 59.769k (± 6.1%) i/s - 299.468k in 5.029980s
Math.sqrt(n).to_i 5.377M (± 6.7%) i/s - 26.969M in 5.039343s
Comparison:
Math.sqrt(n).to_i: 5377357.4 i/s
intsqrt2(n): 59768.9 i/s - 89.97x slower
intsqrt1(n): 58566.6 i/s - 91.82x slower
intsqrt(n): 56563.4 i/s - 95.07x slower
=> true
```

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

This works well in my code.

Easier to code and read with accurate results.

```
class Integer
def sqrt_i
self <= MAX_RANGE ? Math.sqrt(self).to_i : intsqrt(self)
end
private
MAX_RANGE = 9_999_899_999_899_999_322_536_673_279
def intsqrt(n)
bits_shift = (n.bit_length)/2 + 1
bitn_mask = root = 1 << bits_shift
while true
root ^= bitn_mask if (root * root) > n
bitn_mask >>= 1
return root if bitn_mask == 0
root |= bitn_mask
end
end
end
```

```
2.4.0 :020 > n = 10**111; puts n, (a = n.sqrt_i), a*a, (b = Math.sqrt(n).to_i), b*b
1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
31622776601683793319988935444327185337195551393252168268
999999999999999999999999999999999999999999999999999999963630737732541472910196882732895436793219183483386119824
31622776601683794595141492119969113562259413928748515328
1000000000000000080647728865639515055517417465817645783945119006952630720180836471433888096459369651964250947584
```

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

A new version ** intsqrt3** is one line shorter, and more intuitive (to me)

but there is a neglible difference in speed versus

**in Ruby.**

`intsqrt2`

Maybe it's faster in C?

```
def intsqrt(n)
bits_shift = (n.to_s(2).size)/2 + 1
bitn_mask = root = 1 << bits_shift
while true
root ^= bitn_mask if (root * root) > n
bitn_mask >>= 1
return root if bitn_mask == 0
root |= bitn_mask
end
end
def intsqrt1(n)
return n if n | 1 == 1
bits_shift = (Math.log2(n).ceil)/2 + 1
bitn_mask = root = 1 << bits_shift
while true
root ^= bitn_mask if (root * root) > n
bitn_mask >>= 1
return root if bitn_mask == 0
root |= bitn_mask
end
end
def intsqrt2(n)
bits_shift = (n.bit_length)/2 + 1
bitn_mask = root = 1 << bits_shift
while true
root ^= bitn_mask if (root * root) > n
bitn_mask >>= 1
return root if bitn_mask == 0
root |= bitn_mask
end
end
def intsqrt3(n)
bits_shift = (n.bit_length)/2 + 2
root, bitn_mask = 0, (1 << bits_shift)
until (bitn_mask >>= 1) == 0
root |= bitn_mask
root ^= bitn_mask if (root * root) > n
end
root
end
require "benchmark/ips"
Benchmark.ips do |x|
n = 10**40
puts "integer squareroot tests for n = #{n}"
x.report("intsqrt(n)" ) { intsqrt(n) }
x.report("intsqrt1(n)" ) { intsqrt1(n) }
x.report("intsqrt2(n)" ) { intsqrt2(n) }
x.report("intsqrt3(n)" ) { intsqrt3(n) }
# x.report("Math.sqrt(n).to_i") { Math.sqrt(n).to_i }
x.compare!
end
```

```
2.4.0 :391 > load 'intsqrttest.rb'
integer squareroot tests for n = 10000000000000000000000000000000000000000
Warming up --------------------------------------
intsqrt(n) 5.621k i/100ms
intsqrt1(n) 5.890k i/100ms
intsqrt2(n) 5.879k i/100ms
intsqrt3(n) 5.872k i/100ms
Calculating -------------------------------------
intsqrt(n) 58.255k (± 6.9%) i/s - 292.292k in 5.041702s
intsqrt1(n) 59.962k (± 6.8%) i/s - 300.390k in 5.033585s
intsqrt2(n) 60.283k (± 6.4%) i/s - 305.708k in 5.092434s
intsqrt3(n) 60.120k (± 6.7%) i/s - 305.344k in 5.102386s
Comparison:
intsqrt2(n): 60283.1 i/s
intsqrt3(n): 60119.7 i/s - same-ish: difference falls within error
intsqrt1(n): 59962.4 i/s - same-ish: difference falls within error
intsqrt(n): 58254.7 i/s - same-ish: difference falls within error
=> true
```

#### Updated by shevegen (Robert A. Heiler) over 7 years ago

I can not evaluate whether the above is correct or not (not saying that it is not,

I just simply do not know), but I believe that method names such as "`intsqrt2`

" or

"`intsqrt`

" may possibly not be an ideal choice, due to the name alone.

`Math.sqrt()`

is already a bit shortish, adding more abbreviations, in particular

something such as "int" as part of it, would be even more peculiar.

If it would be a bug, as behaviour, why should `Math.sqrt()`

itself not be able

to deal with it? Internally it could call any other method before it outputs

a result.

#### Updated by nobu (Nobuyoshi Nakada) over 7 years ago

Robert A. Heiler wrote:

If it would be a bug, as behaviour, why should

`Math.sqrt()`

itself not be able

to deal with it? Internally it could call any other method before it outputs

a result.

Expected results differ.

`Math.sqrt()`

is expected to return a `Float`

with much precision as possible.

While the proposed method will return an `Integer`

which does not exceed the real square root.

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

To be clear, I am not saying ** Math.sqrt** has a bug in it.

As stated, it produces a floating point result, which inherently has finite (not infinite) precision.

The "bug", or more accurately, the realization of the limitations of using ** Math.sqrt(n).to_i** to

produce correct results for integers past a certain size, led me to suggest creating a new method

which will produce correct results for arbitrary sized integers (i.e. the largest integer value for

`root`

such that `root*root <= n`

).This need emanated for me in numerical applications pertaining to generating very large prime numbers.

As I previously suggested, this method should most naturally be a ** class Integer** method, and after much

consideration I like the method name

**, versus**

`sqrt_i`

`intsqrt`

, `isqrt`

, etc. This makes its usage as -

`n.sqrt_i`

super simple, and idiomatic, and would give an error message if applied to a non-

`Integer`

number/object,to not confuse its use with

**, which inherently produces a**

`Math.sqrt(n)`

**float**value.

I think this name is more ruby idiomatic, because the name of the function is first, and the **x_i**

extension is consistent with other naming conventions to denote class Type, like ** to_i**,

**,**

`to_f`

**.**

`to_s`

The algorithm I provided (both C and Ruby) show the method is very simple to code, and would be very fast

if coded in C, using its native shift operators, etc. If I had any significant C chops I would try to do it myself. :-)

Also, providing a fast native implementation of this function makes Ruby more attractive for numerical applications,

especially for Number Theory and Cryptography, and also enhances the Ruby 3x3 goals of making it at least 2x faster

than Ruby 2.0.

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

After further testing I found the same errors when using ** Math.sqrt(n).to_i** with

large number when using

`(n**(1.0/2)).to_i`

. This reinforces to me the need to provide**(by whatever name) so users won't fall prey to this undocumented anomaly.**

`sqrt_i`

```
2.4.0 :129 > n= 10**35; puts n, a = intsqrt4(n), a*a, b = Math.sqrt(n).to_i, b*b, c = (n**(1.0/2)).to_i, c*c
100000000000000000000000000000000000
316227766016837933
99999999999999999873578871987712489
316227766016837952
100000000000000011890233980627554304
316227766016837952
100000000000000011890233980627554304
=> nil
2.4.0 :130 >
```

I also realized that the algorithm used to determine the integer squareroot can be easily

generalized to correctly generate integer nth roots for arbitrary size integers.

This capability will now allows Ruby to be used for many Number Theory and Cryptographic problems,

such as generating correct integer values for eliptic curves, real integer roots of polynomials,

public key, et al, encryption, for very large number sets.

Because of this, I am going to add this capability to my **roots** rubygem.

https://rubygems.org/gems/roots

It has two (2) methods ** root** and

**which generate all the n roots of an nth root for**

`roots`

all

**types (integers, floats, complex, rational).**

`Numeric`

In order to not clash with my proposed ** class Integer** method name of

**as a Ruby core method,**

`sqrt_i`

I will name two more

**methods**

`class Integer`

**and**

`iroot2`

**to be added to**

`irootn(n)`

**.**

`roots`

Since finding squareroots is much more frequently done than other roots I give it its own method name.

Below is the code for doing this (which I may refine).

These methods will return the correct nth integer real (not complex) root for positive integers, and

for odd roots of negative integers.

```
class Integer
def irootn(n)
return nil if self < 0 && n.even?
raise "root n is < 2 or not an Integer" unless n.is_a?(Integer) && n > 1
sign = self < 0 ? -1 : 1
num = self.abs
bits_shift = (num.bit_length)/n + 2
root, bitn_mask = 0, (1 << bits_shift)
until (bitn_mask >>= 1) == 0
root |= bitn_mask
root ^= bitn_mask if root**n > num
end
root*sign
end
def iroot2; irootn(2) end
end
```

As with generating integer squareroots, doing `(num**(1.0/n)).to_i`

produces incorrect answers for

integers past some maximun value, as shown below.

```
2.4.0 :129 > n = 10**37; puts n, a = n.irootn(3), a**3, b = (n**(1.0/3)).to_i, b**3
10000000000000000000000000000000000000
2154434690031
9999999999987694380850132072511299791
2154434690031
9999999999987694380850132072511299791
=> nil
2.4.0 :130 > n = 10**38; puts n, a = n.irootn(3), a**3, b = (n**(1.0/3)).to_i, b**3
100000000000000000000000000000000000000
4641588833612
99999999999949657815157877549034676928
4641588833612
99999999999949657815157877549034676928
=> nil
2.4.0 :131 > n = 10**39; puts n, a = n.irootn(3), a**3, b = (n**(1.0/3)).to_i, b**3
1000000000000000000000000000000000000000
10000000000000
1000000000000000000000000000000000000000
9999999999999
999999999999700000000000029999999999999
=> nil
2.4.0 :134 > n = 10**112; puts n, a = n.irootn(4), a**4, b = (n**(1.0/4)).to_i, b**4
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
10000000000000000000000000000
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
9999999999999999583119736832
9999999999999998332478947328000104273492291412559539763672807300805169073432752214389506884609933472640769458176
=> nil
2.4.0 :135 >
```

These methods are also preferable because they work correctly for negative integers, while the exponential root

method throws errors.

```
2.4.0 :150 > n = (10**112); puts n, a = n.irootn(3), a**3
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
21544346900318837217592935665193504952
9999999999999999999999999999999999999173635536969138804543654476122863556031297471619838179917400017798906449408
=> nil
2.4.0 :151 > n = (10**112); puts n, b = (n**(1.0/3)).to_i, b**3
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
21544346900318734919579678222552399872
9999999999999857552405189045342271937486394088257690663380520853991701500619736949767614488813401623055562702848
=> nil
2.4.0 :152 > n = -(10**112); puts n, a = n.irootn(3), a**3
-10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
-21544346900318837217592935665193504952
-9999999999999999999999999999999999999173635536969138804543654476122863556031297471619838179917400017798906449408
=> nil
2.4.0 :153 > n = -(10**112); puts n, b = (n**(1.0/3)).to_i, b**3
RangeError: can't convert 1.077217345015937e+37+1.865795172362055e+37i into Integer
from (irb):153:in `to_i'
from (irb):153
from /home/jzakiya/.rvm/rubies/ruby-2.4.0/bin/irb:11:in `<main>'
2.4.0 :154 >
```

Again, creating C coded core versions of these methods not only will be more efficient but faster.

#### Updated by stomar (Marcus Stollsteimer) over 7 years ago

To clarify further the limitations of using Float here:

```
n = 10**35
Math.sqrt(n).to_i # => 316227766016837952
Math.sqrt(n).next_float.to_i # => 316227766016838016
```

BigDecimal / BigMath should be able to produce the correct results, but possibly slower(?) than a specialized integer method:

```
require "bigdecimal"
n = 10**35
r = BigDecimal(n).sqrt(20).to_i # => 316227766016837933
r ** 2 # => 99999999999999999873578871987712489
(r+1) ** 2 # => 100000000000000000506034404021388356
```

*Addendum:*

These methods are also preferable because they work correctly for negative integers, while the exponential root method throws errors.

To be precise: "correct" in *your* sense (i.e. returning a real number); the "exponential root" itself *does* work correctly -- returning a complex root is just fine --, the exception is raised by trying to convert a complex number into an integer. You could easily avoid this specific problem by calculating -(|n|^(1/3)).

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

One if the really nice things about Ruby is its intent to make users/programmers happy,

and to adhere to the "principle of least surprises (pols)".

I don't think it was the intent of Matz to create math methods that give correct results

some of the time, but not tell users under what conditions they wouldn't give correct results.

Unless you had my use cases, it was probably never tested to cover giving correct results for them.

I have identified verified incorrect results that I have seen no documentation warning users of.

I have presented a verified method to always produce correct results, not only for the initial

case for integer squareroots for arbitrary size integers, but now also for any real nth integer

root for arbitrary sized integers.

As I single user, I have solved these inherent errors in using these native math functions for

my use cases. So I now know what I have to do so I will get correct answers in my applications.

I now will also include these methods in my **roots** gem so if others need/want them they are available.

I think it would be in the best interest of Ruby users for the language to provide them the methods

I have presented to create correct results for their use cases, which should be fairly easy to code

and include in the core language, and documented accordingly for users, versus messing with the

inherent floating point methods and figuring out how to coerce them into producing correct integer

results for every possible arbitrary sized integer, or forcing users to figure out how to do so.

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

```
class Integer
def irootn(n)
return nil if self < 0 && n.even?
raise "root n is < 2 or not an Integer" unless n.is_a?(Integer) && n > 1
num = self.abs
bits_shift = (num.bit_length)/n + 2
root, bitn_mask = 0, (1 << bits_shift)
until (bitn_mask >>= 1) == 0
root |= bitn_mask
root ^= bitn_mask if root**n > num
end
root *= (self < 0 ? -1 : 1)
end
def iroot2; irootn(2) end
end
require "bigdecimal"
require "benchmark/ips"
Benchmark.ips do |x|
n = 10**35
puts "integer squareroot tests for n = #{n}"
x.report("iroot2" ) { n.iroot2 }
x.report("irootn(2)" ) { n.irootn(2) }
x.report("BigDecimal(n).sqrt(5 ).to_i") { BigDecimal(n).sqrt(5 ).to_i }
x.report("BigDecimal(n).sqrt(10).to_i") { BigDecimal(n).sqrt(10).to_i }
x.report("BigDecimal(n).sqrt(20).to_i") { BigDecimal(n).sqrt(20).to_i }
x.compare!
end
```

Yes, its much slower, even to the highlevel Ruby versions.

```
2.4.0 :201 > load 'irootstest.rb'
integer squareroot tests for n = 100000000000000000000000000000000000
Warming up --------------------------------------
iroot2 5.681k i/100ms
irootn(2) 5.714k i/100ms
BigDecimal(n).sqrt(5 ).to_i
3.021k i/100ms
BigDecimal(n).sqrt(10).to_i
2.953k i/100ms
BigDecimal(n).sqrt(20).to_i
2.616k i/100ms
Calculating -------------------------------------
iroot2 57.825k (± 3.3%) i/s - 289.731k in 5.016021s
irootn(2) 57.462k (± 3.7%) i/s - 291.414k in 5.078940s
BigDecimal(n).sqrt(5 ).to_i
30.543k (± 2.8%) i/s - 154.071k in 5.048265s
BigDecimal(n).sqrt(10).to_i
30.709k (± 3.1%) i/s - 153.556k in 5.005239s
BigDecimal(n).sqrt(20).to_i
26.725k (± 3.0%) i/s - 136.032k in 5.094723s
Comparison:
iroot2: 57825.2 i/s
irootn(2): 57461.9 i/s - same-ish: difference falls within error
BigDecimal(n).sqrt(10).to_i: 30708.9 i/s - 1.88x slower
BigDecimal(n).sqrt(5 ).to_i: 30543.4 i/s - 1.89x slower
BigDecimal(n).sqrt(20).to_i: 26725.0 i/s - 2.16x slower
=> true
2.4.0 :202
```

And you need to know beforehand the needed correct precision to display the correct results.

```
2.4.0 :214 > n = 10**35; puts n, n.iroot2, BigDecimal(n).sqrt(5).to_i, BigDecimal(n).sqrt(10).to_i,BigDecimal(n).sqrt(20).to_i
100000000000000000000000000000000000
316227766016837933
316227766016837933
316227766016837933
316227766016837933
=> nil
2.4.0 :215 > n = 10**45; puts n, n.iroot2, BigDecimal(n).sqrt(5).to_i, BigDecimal(n).sqrt(10).to_i,BigDecimal(n).sqrt(20).to_i
1000000000000000000000000000000000000000000000
31622776601683793319988
31622776601683666666666
31622776601683666666666
31622776601683793319988
=> nil
2.4.0 :216 > n = 10**55; puts n, n.iroot2, BigDecimal(n).sqrt(5).to_i, BigDecimal(n).sqrt(10).to_i,BigDecimal(n).sqrt(20).to_i
10000000000000000000000000000000000000000000000000000000
3162277660168379331998893544
3162277660168379331499021527
3162277660168379331499021527
3162277660168379331998893544
=> nil
2.4.0 :217 > n = 10**65; puts n, n.iroot2, BigDecimal(n).sqrt(5).to_i, BigDecimal(n).sqrt(10).to_i,BigDecimal(n).sqrt(20).to_i
100000000000000000000000000000000000000000000000000000000000000000
316227766016837933199889354443271
316227766016837466536394723986322
316227766016837466536394723986322
316227766016837933199889000000000
=> nil
2.4.0 :218 >
```

#### Updated by stomar (Marcus Stollsteimer) over 7 years ago

@Jabari Note that my comments were not meant as arguments against your feature request, I only wanted to point out the reasons for the observed behavior. Personally, I always welcome improved math functionality; however, I'm not informed enough to know whether this should be in the stdlib or not.

There certainly is a problem with the docs for Math.sqrt, which is incomplete (return value types are not given), misleading (it promises to avoid rounding errors), and actually doesn't match the actual behavior:

```
$ ri Math::sqrt
(from ruby site)
------------------------------------------------------------------------------
sqrt(a)
------------------------------------------------------------------------------
Computes the square root of a. It makes use of Complex and Rational to
have no rounding errors if possible.
Math.sqrt(4/9) # => 2/3
Math.sqrt(- 4/9) # => Complex(0, 2/3)
Math.sqrt(4.0/9.0) # => 0.666666666666667
```

Based on this you would for instance expect the square root of an integer square number to be returned as an (exact) integer or rational; but actually it's a float:

```
Math.sqrt(9) # => 3.0 ("no rounding errors if possible" ?!?)
```

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

No, no, I didn't take it as a negative comment per se,

I just hadn't tried to use **BigDecimal** before, so I just went ahead and

created tests to see for myself how they performed (accuracy and speed).

Ruby has a much nicer community compared to some others, which is why

I've tried to document the problem, and a solution for it, in great detail.

I want to see Ruby become more used in mathematical and numerical applications

like Python is now becoming. Thus, it needs to be as accurate as possible, and

then, of course, fast (enough).

For users (new and old) using Ruby for advanced mathematical/numerical applications,

and the new rave of "big data" things, I don't want it developing a reputation of

producing erroneous results, especially compared to Python, et al, which maybe don't

produce these errors (and I have no idea how inherently inaccurate Python is).

Here is a section of the README.md file for my new version of my **roots** gem I'll be releasing.

```
## Description
Starting with Roots 2.0.0 (2017-2-20) the methods **iroot2** and **irootn** were added.
They are instance_methods of **class Integer** that will accurately compute the real value
squareroot and nth_root for arbitrary sized Integers.
If you have an application where you need the exact correct integer real value for roots,
especially for arbitrary sized (large) integers, use these methods instead or **root|s**.
These methods work strictly in the Integer domain, and do not first create floating point
approximations of these roots and then convert them to integers.
They are useful for applications in pure math, Number Theory, Cryptography, etc, where you
need to find the exact real integer roots of polynomials, encryption functions, etc.
The methods **root** and **roots** work for all the **Numeric** classes (integers, floats,
complex, rationals), and produce floating point results. They, thus, produce floating
point approximations to the exact **Integer** values for real roots for arbitrary sized integers.
Usage
**iroot2**
Use syntax: ival.iroot2
Return the largest Integer +root+ of Integer ival such that root**2 <= ival
A negative ival will result in 'nil' being returned.
9.iroot2 => 3
-9.iroot2 => nil
120.iroot2 => 10
121.iroot2 => 11
121.iroot2.class => Integer
(10**10).iroot2 => 100000
(10**10).root(2) => 100000.0
(10**10).roots(2)[0] = 100000.0+0.0i
(10**11).iroot2 => 316227
(10**11).root(2) => 316227.76601684
(10**11).roots(2)[0] = 316227.76601684+0.0i
(10**110).iroot2 => 10000000000000000000000000000000000000000000000000000000
(10**110).root(2) => 1.0e+55
(10**110).roots(2)[0] = 1.0e+55+0.0i
(10**111).iroot2 => 31622776601683793319988935444327185337195551393252168268
(10**111).root(2) => 3.1622776601683795e+55
(10**111).roots(2)[0] = 3.1622776601683795e+55+0.0i
```

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

Here's the README.md writeup on the general integer roots algorithm.

At the cpu level, all these bit operations -- >>, <<, |, ^ -- are one clock cycle instructions,

so you see this can be very fast if done in C. Maybe for arbitrary size numbers that can't

reside in a single cpu register they take longer, but they would still be faster than high level Ruby.

```
For integer numbers:
For binary based (digital) computers (cpu), all integers are represented as
binary numbers. To find the root "n" of an integer "num" we can use the following
process to find the largest integer nth "root" such that root**n <= num.
For an integer composed of 'b' bits an nth root 'n' will have at most (b/n + 1) bits.
For squareroots, n = 2:
For 9 = 0b1001, it has b = 4 bits, and its squareroot has at most (4/2 + 1) = 3 bits.
The squareroot of 9 is 3 = 0b11, which is 2 bits.
For 100 = 0b1100100, b = 7, and its squareroot has at most (7/2 + 1) = 4 bits.
The squareroot of 100 is 10 = 0b1010, which is 4 bits.
For cuberoots, n = 3:
For 8 = 0b1000, it has b = 4 bits, and its cuberoot has at most (4/3 + 1) = 2 bits.
The cuberoot of 8 is 2 = 0b10, which is 2 bits.
For 125 = 0b1111101, b = 7, and its cuberoot has at most (7/3 + 1) = 3 bits.
The cuberoot of 125 is 5 = 0b101, which is 3 bits.
Algorithm:
bits_shift = (num.bits.length)/n + 1 # determine max number of root bits
bitn_mask = 1 << bits_shift # set value for max bit position of root
root = 0 # initialize the value for root
until bitn_mask == 0 # step through all the bit positions for root
root |= bitn_mask # set current bit position to '1' in root
root ^= bitn_mask if root**n > num # set it back to '0' if root too large
bitn_mask >>= 1 # set bitn_mask to next smaller bit position
end
root # return exact integer value for root
```

#### Updated by akr (Akira Tanaka) over 7 years ago

Is there many application for this method?

Concrete examples may help to persuade matz.

Another important problem is the method name.

If we use same name for same functionality in other languages/libraries,

it makes learning this method easier.

What's the name of this function in other languages/libraries?

I found isqrt in Common Lisp and Tcl, mpz_sqrt in GMP.

Others?

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

I updated my **roots** rubygem to 2.0.0 to include **iroot2** and **irootn**.

https://rubygems.org/gems/roots

https://github.com/jzakiya/roots

#### Updated by Student (Nathan Zook) over 7 years ago

I just noticed this thread. One bit per cycle methods are not fast by modern methods, they are quite slow. I will attempt to have something more concrete within 24 hours.

Nathan Zook

#### Updated by Student (Nathan Zook) over 7 years ago

You might want to consider the following articles:

https://www.reddit.com/r/algorithms/comments/1zt63v/fast_algorithm_to_calculate_integer_square_root/

Which lead me to wikipedia:

https://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Binary_numeral_system_.28base_2.29

(You might want to start a bit higher in the article to get the context.)

and also to Zimmerman's implementation:

https://gmplib.org/repo/gmp-5.1/file/c5010c039373/mpn/generic/sqrtrem.c

The wikipedia article implements a faster bit-by-bit computation. Zimmerman implements the standard solution, which converges quadratically.

Looking at your comments, however, you seem really to be interested in a high performance multiprecision library for ruby. Is there any particular reason that we should not take an existing C library and drop a wrapper around it?

#### Updated by Student (Nathan Zook) over 7 years ago

I am confused by your solution in comment #3. By your own report, there is no issue for numbers much larger than a single word. But this code is clearly appropriate only for single-word inputs.

#### Updated by matz (Yukihiro Matsumoto) over 7 years ago

It's OK to add the functionality. Is `Integer.sqrt(n)`

OK for you?

Matz.

#### Updated by nobu (Nobuyoshi Nakada) over 7 years ago

Yukihiro Matsumoto wrote:

It's OK to add the functionality. Is

`Integer.sqrt(n)`

OK for you?

When `n`

is not an `Integer`

(but a `Numeric`

), is it truncated first then `sqrt`

?

#### Updated by tompng (tomoya ishida) over 7 years ago

using Newton's method might be another good way to implement it.

```
def intsqrt_newton(n)
raise if n<0
return n if n<=1
r = 1<<((n.bit_length+1)/2)
# r*r >= n
while true # r*r-n >= 2*r
r2 = r-(r*r-n)/(2*r)
break if r2 == r
r=r2
end
# r*r-n < 2*r
# (r-1)**2 < n+1
# √n-1 < r-1 <= √n < √(n+1)
r-1
end
```

#### Updated by stomar (Marcus Stollsteimer) over 7 years ago

*Edit: please ignore (misunderstanding)*

~~I think ~~`Integer.sqrt`

or `Numeric.sqrt`

is extremely problematic. I guess everybody would expect it to be an alias for `Math::sqrt`

and behave like this:

```
# (probably) expected behavior
2.sqrt(2) # => 1.4142135623730951
10.sqrt(2) # => 3.1622776601683795
(6.25).sqrt(2) # => 2.5
# proposed behavior for this feature
10.sqrt(2) # => 3
# _not_ part of this request
(6.25).sqrt(2) # 2 ??? 2.5 ???
```

~~Moreover, ~~`sqrt`

(*square* root) implies `n = 2`

, so `sqrt(3)`

, `sqrt(4)`

etc. is contradictory.

#### Updated by stomar (Marcus Stollsteimer) over 7 years ago

For square roots, `isqrt`

seems to be an option with some(?) spread, see https://en.wikipedia.org/wiki/Integer_square_root. It would look like this:

```
26.isqrt # => 5
30.isqrt # => 5
```

For higher roots, maybe `iroot(n)`

could be an option:

```
26.iroot(3) # => 2
30.iroot(3) # => 3
```

~~(However, I'm not sure whether the request goes beyond square roots.)~~

#### Updated by akr (Akira Tanaka) over 7 years ago

Marcus Stollsteimer wrote:

I think

`Integer.sqrt`

or`Numeric.sqrt`

is extremely problematic. I guess everybody would expect it to be an alias for`Math::sqrt`

and behave like this:

No. Integer.sqrt is a class method.

```
Integer.sqrt(1000) #=> 31
```

It is clealy different from "Math.sqrt(n)" (or "sqrt(n)" after "include Math").

The benefit of Integer.sqrt is that it doesn't introduce a new word which expands

vocabulary of ruby.

If there is common name for this function, expanding the vocabulary is not a big problem, though.

Also, Integer.sqrt(n) is square root.

So, n-th root should be a different method:

Integer.root(m, n) for n-th root, for example.

#### Updated by matz (Yukihiro Matsumoto) over 7 years ago

Marcus, probably you are confusing `Integer.sqrt()`

and `Integer#sqrt()`

.

The former, you call `Integer.sqrt(4) # => 2`

and the latter, you call `4.sqrt # => 2`

.

We are talking about the former.

Matz.

#### Updated by Student (Nathan Zook) over 7 years ago

Nobuyoshi Nakada wrote:

Yukihiro Matsumoto wrote:

It's OK to add the functionality. Is

`Integer.sqrt(n)`

OK for you?When

`n`

is not an`Integer`

(but a`Numeric`

), is it truncated first then`sqrt`

?

The beauty of this particular problem is that it does not matter. ;)

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

Let me attempt to address some of the issues/questions that have recently been presented.

Excuse me if any of this redundant, at the time I am writing.

- Fastest algorithm

After extensive searching and testing, the fastest and most flexible algorithm

to exactly compute not only the integer squareroot, but also any root, is the

one I'll just name the "binary bit method" (**bbm**). See benchmarks below of code.

The methods **iroot2** and **irootn** are now in my **roots** gem and use **bbm**.

I encourage people to take this code and include any other proposed methods to

create a reference test suite for assessing performance (assuming the other methods

correctly compute the exact root values).

```
class Integer
def irootn(n)
return nil if self < 0 && n.even?
raise "root n is < 2 or not an Integer" unless n.is_a?(Integer) && n > 1
num = self.abs
bits_shift = (num.bit_length)/n + 2
root, bitn_mask = 0, (1 << bits_shift)
until (bitn_mask >>= 1) == 0
root |= bitn_mask
root ^= bitn_mask if root**n > num
end
root *= (self < 0 ? -1 : 1)
end
def iroot2; irootn(2) end
end
def intsqrt7(n) # Newton's method
x = n
y = (n + 1)/2
while y < x
x = y
y = (x + n/x)/2
end
x
end
# from http://www.codecodex.com/wiki/Calculate_an_integer_square_root
def intsqrt8(square)
# Check our input
square = square.to_i # force integer
return 0 if square == 0 # quick exit
raise RangeError if square < 0
# Actual computation
n = iter(1, square)
n1 = iter(n, square)
n1, n = iter(n1, square), n1 until n1 >= n - 1
n1 = n1 - 1 until n1*n1 <= square
return n1
end
def iter(n, square) (n+(square/n))/2 end
require "bigdecimal"
require "benchmark/ips"
Benchmark.ips do |x|
n = 10**40
puts "integer squareroot tests for n = #{n}"
x.report("iroot2" ) { n.iroot2 }
x.report("irootn(2)" ) { n.irootn(2) }
x.report("newtons" ) { intsqrt7(n) }
x.report("codecodex" ) { intsqrt8(n) }
x.compare!
end
```

All will correctly compute the exact integer squareroot, but bbm is clearly faster.

```
2.4.0 :024 > load 'irootstest.rb'
integer squareroot tests for n = 10000000000000000000000000000000000000000
Warming up --------------------------------------
iroot2 4.368k i/100ms
irootn(2) 4.375k i/100ms
newtons 3.491k i/100ms
codecodex 2.880k i/100ms
Calculating -------------------------------------
iroot2 43.969k (± 3.0%) i/s - 222.768k in 5.071034s
irootn(2) 43.990k (± 3.1%) i/s - 223.125k in 5.077264s
newtons 35.369k (± 2.9%) i/s - 178.041k in 5.038327s
codecodex 29.093k (± 3.0%) i/s - 146.880k in 5.053165s
Comparison:
irootn(2): 43990.3 i/s
iroot2: 43969.0 i/s - same-ish: difference falls within error
newtons: 35369.0 i/s - 1.24x slower
codecodex: 29093.5 i/s - 1.51x slower
=> true
```

- It is well understood in numerical analysis, Number Theory, cryptography, etc,

an integer nth_root n is the largest integer such that root*n <= num. It is not a

floating point value, nor just the truncation of a floating point value. A

calculator "root" is, by definition, a floating point value, and is an approximation

of some possible exact value, to the precision of the system to produce it.

Other languages have at least the equivalent of an **isqrt** function that work

strictly on integers, not floats, for the reasons given above.

- Usefulness in advanced mathematical and numerical analysis fields.

Most people/programmers probably first (and mostly) think of Ruby as a Rails (web)

language. Whereas Python is being promoted and used widely now in a number of

mathematical and numerical analysis based projects.

https://jupyter.org/

https://ipython.org/

https://github.com/jupyterhub/jupyterhub

http://www.sagemath.org/

https://en.wikipedia.org/wiki/SageMath

I think Ruby is way easier to use (in general) for these applications, but it doesn't

have anywhere near the breadth of libraries and focus from its user base to be applied

to them. So Python is being used, basically by default, because it's easier to program

in than C/C++, and being made fast enough to satisfy more users. I think as a focus for

Ruby 3x3 it can start to create more necessary primitive methods to give it more creed

to be applicable to these fields.

- roots as primitive (machine level) methods.

Implementing the **bbm** algorithm as a primitive method can be at least 1-3 orders

faster than the high level implementations shown in **irooot2** and **irootn**.

The following is why, with computing squareroot, but conceptually applies to all roots too.

For a cpu with n even bits (32, 64) an integer squareroot will only be composed of at most

n/2 bits. So a 64 bit cpu can retain in one register the squareroot of a 128 bit number,

i.e. a value of 2**128 - 1 = 340,282,366,920,938,463,463,374,607,431,768,211,455. For an

arbitrary sized integer it will take take (log2(num).ceil)/64 +1 cpu mem locations to

represent the number.

So at the primitive level, it would first determine how many register length chunks the

maximum root value would need. If it needed only one register (a num < 2**128), then all
the bbm process is super simple and fast. If the root value takes more than one register,
the primitive algorithm would start with the MSchunk, do the bbm, and switch in the lower
chunks until finished. For roots > 2, this becomes even faster because any nth_root n requires at
most (num.bit_length/n + 1) bit to represent it. So the cube (3) root can be fit in 64 bits for
any integer < 2**(64*3) = 2**192 = 62,77,101,735,386,680,763,835,789,423,207,666,416,102,355,444,464,034,512,896

No other algorithm has the flexibility and efficiency of the **bbm** because it requires no

additions, subtractions, or divisions, only a multiplication via an exponentiation. And that

can be done efficiently for large numbers with an appropriate algorithm (better than brute force).

All the other math (besides the > and == comparisons) involve in place bit operations -- `>>, <<, ^, |`

.

which at the cpu are one cycle (or better) operations.

I hope this has sufficiently helped to address these issues/questions.

#### Updated by stomar (Marcus Stollsteimer) over 7 years ago

Matz, akr:

sorry for the noise, it's of course completely clear. I only didn't read close enough and missed the switch from the proposed instance method and n-th root to a class method. Stupid.

Now it's also clear to me that you were only talking about square roots, not higher roots.

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

No, I am sorry for any confusion.

My initial concern was for accurately computing exact integer squareroots

for my immediate use case needs.

After finding the **bbm** algorithm for squareroots, and modifying it to

compute any nth-root, I realized this should be the primitive to create,

because the squareroot (n=2) is just a particular instance, which is the

most frequently used root.

#### Updated by Student (Nathan Zook) over 7 years ago

Jabari Zakiya wrote:

Let me attempt to address some of the issues/questions that have recently been presented.

Excuse me if any of this redundant, at the time I am writing.

- Fastest algorithm
After extensive searching and testing, the fastest and most flexible algorithm

to exactly compute not only the integer squareroot, but also any root, is the

one I'll just name the "binary bit method" (bbm). See benchmarks below of code.The methods

iroot2andirootnare now in myrootsgem and usebbm.I encourage people to take this code and include any other proposed methods to

create a reference test suite for assessing performance (assuming the other methods

correctly compute the exact root values).

First, I love the idea of giving ruby the ability to play in the MP space. Anything to allow this beautiful tool to be used more broadly must be a good thing. I very much support adding such support. (Yes, this is me volunteering.)

Having said this, to do so involves a number of issues. For instance, I just attempted to time Integer#prime? on a 4096-bit number. Got a floating point overflow. Same for 2048 bits. So, I tried 1024. It's still running. In fact, I expect my computer to have a mechanical breakdown before the routine returns true.

The articles which I linked included both a faster bit-by-bit algorithm, and mentioned Newton-Raphson-derived methods, which clearly converge with better O(). The given benchmark is of a 133-bit number. I would like to see timing numbers for 128, 512, 1024, 2048, and 4096-bit integers before suggesting that we are going there. Serious work looks a much larger integers, btw.

I'll see what I can whip up for benchmarks...

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

Hey that's great! I'd love to help work on this too, though

I'm not much of a C programmer. I can help with testing though.

Below are test results comparing the **bbm** algorithm used in

**iroot2** and **irootn** versus the general Newton's method.

As you can see, for the range of values used, Newton becomes

much slower as the numbers gets bigger.

First **bbm**, for squareroot, is order O(log2(n)/2)), which I

think beats Newton. But also compared to Newton, all those

additions and divisions done in the inner loop become more

expensive (cpu intensive) as the numbers get bigger (taking care

of carries, etc), whereas the **bbm** merely does inplace bit operations.

You obviously can make Newton's mechanically better, but I suspect

it will never be faster on a real computer than **bbm** because of

its cpu work cost to implement. In fact my I7 cpu laptop's fan started

to kick in running these benchmarks, which I let cool down between runs.

Both methods did create the same correct results.

Test suite:

```
class Integer
def irootn(n)
return nil if self < 0 && n.even?
raise "root n is < 2 or not an Integer" unless n.is_a?(Integer) && n > 1
num = self.abs
bits_shift = (num.bit_length)/n + 2
root, bitn_mask = 0, (1 << bits_shift)
until (bitn_mask >>= 1) == 0
root |= bitn_mask
root ^= bitn_mask if root**n > num
end
root *= (self < 0 ? -1 : 1)
end
def iroot2; irootn(2) end
end
def intsqrt7(n) # Newton's method
x = n
y = (n + 1)/2
while y < x
x = y
y = (x + n/x)/2
end
x
end
require "benchmark/ips"
Benchmark.ips do |x|
exp = 4000; n = 10**exp
puts "integer squareroot tests for n = 10**#{exp}"
x.report("iroot2" ) { n.iroot2 }
x.report("irootn(2)" ) { n.irootn(2) }
x.report("newtons" ) { intsqrt7(n) }
x.compare!
end
```

Benchmark results:

```
2.4.0 :042 > load 'irootstest.rb'
integer squareroot tests for n = 10**500
Warming up --------------------------------------
iroot2 70.000 i/100ms
irootn(2) 69.000 i/100ms
newtons 43.000 i/100ms
Calculating -------------------------------------
iroot2 636.901 (±14.9%) i/s - 3.150k in 5.089664s
irootn(2) 676.828 (± 5.6%) i/s - 3.381k in 5.012279s
newtons 427.424 (± 4.4%) i/s - 2.150k in 5.040686s
Comparison:
irootn(2): 676.8 i/s
iroot2: 636.9 i/s - same-ish: difference falls within error
newtons: 427.4 i/s - 1.58x slower
=> true
2.4.0 :043 > load 'irootstest.rb'
integer squareroot tests for n = 10**1000
Warming up --------------------------------------
iroot2 22.000 i/100ms
irootn(2) 22.000 i/100ms
newtons 14.000 i/100ms
Calculating -------------------------------------
iroot2 228.501 (± 3.9%) i/s - 1.144k in 5.014841s
irootn(2) 227.442 (± 4.8%) i/s - 1.144k in 5.042017s
newtons 139.466 (± 3.6%) i/s - 700.000 in 5.027213s
Comparison:
iroot2: 228.5 i/s
irootn(2): 227.4 i/s - same-ish: difference falls within error
newtons: 139.5 i/s - 1.64x slower
=> true
2.4.0 :044 > load 'irootstest.rb'
integer squareroot tests for n = 10**2000
Warming up --------------------------------------
iroot2 6.000 i/100ms
irootn(2) 6.000 i/100ms
newtons 3.000 i/100ms
Calculating -------------------------------------
iroot2 67.819 (± 2.9%) i/s - 342.000 in 5.049151s
irootn(2) 68.311 (± 2.9%) i/s - 342.000 in 5.013356s
newtons 38.863 (± 2.6%) i/s - 195.000 in 5.024956s
Comparison:
irootn(2): 68.3 i/s
iroot2: 67.8 i/s - same-ish: difference falls within error
newtons: 38.9 i/s - 1.76x slower
=> true
2.4.0 :045 > load 'irootstest.rb'
integer squareroot tests for n = 10**4000
Warming up --------------------------------------
iroot2 1.000 i/100ms
irootn(2) 1.000 i/100ms
newtons 1.000 i/100ms
Calculating -------------------------------------
iroot2 17.380 (± 5.8%) i/s - 87.000 in 5.011300s
irootn(2) 17.417 (± 5.7%) i/s - 87.000 in 5.000415s
newtons 9.512 (± 0.0%) i/s - 48.000 in 5.050706s
Comparison:
irootn(2): 17.4 i/s
iroot2: 17.4 i/s - same-ish: difference falls within error
newtons: 9.5 i/s - 1.83x slower
=> true
2.4.0 :046 >
```

#### Updated by Student (Nathan Zook) over 7 years ago

```
def isqrt(n)
r = 0
o = 1 << ((n.bit_length >> 1 ) << 1)
while o != 0 do
t = r + o
if n >= t
n -= t
r = (r >> 1) + o
else
r >>= 1
end
o >>= 2
end
r
end
def bench(n)
Benchmark.ips do |x|
puts "integer squareroot tests for n = #{n}"
x.report("iroot2" ) { n.iroot2 }
x.report("isqrt" ) { isqrt(n) }
x.compare!
end
nil
end
bench(41)
integer squareroot tests for n = 10 ** 41
Warming up --------------------------------------
iroot2 3.264k i/100ms
isqrt 3.088k i/100ms
Calculating -------------------------------------
iroot2 33.749k (± 0.6%) i/s - 169.728k in 5.029342s
isqrt 31.469k (± 1.4%) i/s - 157.488k in 5.005489s
Comparison:
iroot2: 33748.7 i/s
isqrt: 31469.0 i/s - 1.07x slower
=> nil
irb(main):245:0> bench(500)
integer squareroot tests for n = 10 ** 500
Warming up --------------------------------------
iroot2 100.000 i/100ms
isqrt 97.000 i/100ms
Calculating -------------------------------------
iroot2 1.002k (± 0.7%) i/s - 5.100k in 5.092307s
isqrt 976.542 (± 0.8%) i/s - 4.947k in 5.066141s
Comparison:
iroot2: 1001.6 i/s
isqrt: 976.5 i/s - 1.03x slower
=> nil
irb(main):246:0> bench(1000)
integer squareroot tests for n = 10 ** 1000
Warming up --------------------------------------
iroot2 31.000 i/100ms
isqrt 38.000 i/100ms
Calculating -------------------------------------
iroot2 322.487 (± 1.6%) i/s - 1.643k in 5.095980s
isqrt 392.904 (± 1.3%) i/s - 1.976k in 5.029970s
Comparison:
isqrt: 392.9 i/s
iroot2: 322.5 i/s - 1.22x slower
=> nil
irb(main):247:0> bench(2000)
integer squareroot tests for n = 10 ** 2000
Warming up --------------------------------------
iroot2 7.000 i/100ms
isqrt 15.000 i/100ms
Calculating -------------------------------------
iroot2 74.351 (± 1.3%) i/s - 378.000 in 5.084339s
isqrt 154.343 (± 1.3%) i/s - 780.000 in 5.054739s
Comparison:
isqrt: 154.3 i/s
iroot2: 74.4 i/s - 2.08x slower
=> nil
irb(main):248:0> bench(4000)
integer squareroot tests for n = 10 ** 4000
Warming up --------------------------------------
iroot2 1.000 i/100ms
isqrt 5.000 i/100ms
Calculating -------------------------------------
iroot2 15.017 (± 0.0%) i/s - 76.000 in 5.061323s
isqrt 56.217 (± 1.8%) i/s - 285.000 in 5.070483s
Comparison:
isqrt: 56.2 i/s
iroot2: 15.0 i/s - 3.74x slower
=> nil
```

It is interesting to note that isqrt is actually slower in the low end--this is because the ruby interpreter is doing more per loop. This in spite of the fact that this method does not require multiplies!

But back to my earlier question: if we want folks to use Ruby for multi-precision work, why not encourage them to link to GMP? (And use GMP's sqrt?) We've supported this since 2.1. I know that the square root there will be much, much faster. If we want to implement a non-GMP solution (because of the license issue), then we have to decide how much work to put into it.

I've read Zimmerman's paper (he appears to be the author of the GMP code), https://hal.inria.fr/inria-00072854/document, and it does not look too hard to implement, assuming that we have the primitives already in place. Implementing the primitives would not be overly difficult, but again, we need to start asking just how far to go. Any serious MP work must drop down to asm, because of things like the carry bit and full-word multiplication.

I'm sorry to have confused you about Newton's method. The fast way to compute square roots using Newton's method is to compute 1/sqrt(n), then multiply by n. (This can be made to always give the correct answer.) This is called "Newton-Raphson" in the literature. It has been amazingly hard to find an mp integer implementation anywhere, and I'm good enough to know better than to trust something that I just put together.

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

Ah, these are good and interesting results Nathan. Thanks for posting.

It would be interesting to see how both faired as primitive implementations.

I will definitely study the code and play with it.

However **bbm** may still be the smallest/efficient/fastest way to exactly do

roots > 2, and can be used as the benchmark algorithm to test others against.

It certainly eliminates having to create specific optimal implementations for

individual roots > 2.

However, here's another reason this issue is important to the larger Ruby 3x3 goal.

In the library file **prime.rb** is the method **Integer#prime?**.

```
class Integer
...
...
def prime?
return self >= 2 if self <= 3
return true if self == 5
return false unless 30.gcd(self) == 1
(7..Math.sqrt(self).to_i).step(30) do |p|
return false if
self%(p) == 0 || self%(p+4) == 0 || self%(p+6) == 0 || self%(p+10) == 0 ||
self%(p+12) == 0 || self%(p+16) == 0 || self%(p+22) == 0 || self%(p+24) == 0
end
true
end
...
...
```

As in typical prime algorithms, you just check for the primes <= the number's squareroot.

Here **Math.sqrt(self).to_i** determines the integer squareroot, but as we know (and as I

found out, which initiated all this), as the numbers gets `> ~10**28 `

, the approximation for

the squareroot becomes more and more larger than the actual value. This will cause **prime?**

to do increasingly more work than necessary as the numbers get bigger.

(Completely as an aside, for numbers this size you should probably look to use: n.to_bn.prime?)

In **prime.rb** I counted three places where **Math.sqrt** is used to find the integer squareroot.

Creating a primitive integer squareroot method would not only make these results exact, but smaller

as n gets bigger, increasing the accuracy and speed of any application which uses them.

It would be interesting to see what an audit of the entire Ruby code base would reveal where

this is used to provide the integer squareroot, versus the floating point one.

But also to Nathan's point, until I started looking for better alternatives to work with larger

numbers, I was totally unaware of the resources in the **openssl** library that can be applied

to arbitrary sized integers. Since these resource already exist within the Ruby ecosystem, maybe

a first step is to assess and test how well these existing resources meet these needs and

then standardize using them ubiquitously within the core library to optimize their performance.

From a users perspective, I first want accurate results, then speed. A fast (or slow) incorrect

result is of no value in doing mathematical or numerical analysis heavy applications.

#### Updated by tompng (tomoya ishida) over 7 years ago

Newton seems to be faster than bbm, if the initial x is closer to √n.

when I use x=1<<((n.bit_length+1)/2) for the initial x in newton's method,

the iteration count (n=10**4000) became 11 (was 6656 when initial x is n).

I think the calculation cost is:

```
newton: O(log(k) * k**1.585))
Nathan Zook's isqrt: O(k**2)
bbm: O(k * k**1.585))
k=log(n)
bignum mult: O(k**1.585) # Bignum uses karatsuba algorithm
```

benchmark

```
class Integer
def irootn(n)
return nil if self < 0 && n.even?
raise "root n is < 2 or not an Integer" unless n.is_a?(Integer) && n > 1
num = self.abs
bits_shift = (num.bit_length)/n + 2
root, bitn_mask = 0, (1 << bits_shift)
until (bitn_mask >>= 1) == 0
root |= bitn_mask
root ^= bitn_mask if root**n > num
end
root *= (self < 0 ? -1 : 1)
end
def iroot2; irootn(2) end
end
def intsqrt7(n) # Newton's method
x = n
y = (n + 1)/2
while y < x
x = y
y = (x + n/x)/2
end
x
end
def intsqrt7_with_good_initial_value(n) # Newton's method with √n < initial_x <= 2*√n
raise if n<0
return n if n<=1
x = 1<<((n.bit_length+1)/2)
y = (x + n/x)/2
while y < x
x = y
y = (x + n/x)/2
end
x
end
require "benchmark/ips"
raise unless 100000.times.all?{|i|intsqrt7_with_good_initial_value(i) == i.iroot2}
Benchmark.ips do |x|
exp = 4000; n = 10**exp
puts "integer squareroot tests for n = 10**#{exp}"
x.report("iroot2" ) { n.iroot2 }
x.report("newtons" ) { intsqrt7(n) }
x.report("newtons_fast" ) { intsqrt7_with_good_initial_value(n) }
x.compare!
end
```

result

```
integer squareroot tests for n = 10**4000
Warming up --------------------------------------
iroot2 1.000 i/100ms
newtons 1.000 i/100ms
newtons_fast 90.000 i/100ms
Calculating -------------------------------------
iroot2 13.054 (±15.3%) i/s - 64.000 in 5.063850s
newtons 2.455 (± 0.0%) i/s - 13.000 in 5.337181s
newtons_fast 939.494 (± 7.3%) i/s - 4.680k in 5.022312s
Comparison:
newtons_fast: 939.5 i/s
iroot2: 13.1 i/s - 71.97x slower
newtons: 2.5 i/s - 382.74x slower
```

#### Updated by akr (Akira Tanaka) over 7 years ago

Jabari Zakiya wrote:

In the library file

prime.rbis the methodInteger#prime?.`class Integer ... ... def prime? return self >= 2 if self <= 3 return true if self == 5 return false unless 30.gcd(self) == 1 (7..Math.sqrt(self).to_i).step(30) do |p| return false if self%(p) == 0 || self%(p+4) == 0 || self%(p+6) == 0 || self%(p+10) == 0 || self%(p+12) == 0 || self%(p+16) == 0 || self%(p+22) == 0 || self%(p+24) == 0 end true end ... ...`

As in typical prime algorithms, you just check for the primes <= the number's squareroot.

HereMath.sqrt(self).to_idetermines the integer squareroot, but as we know (and as I

found out, which initiated all this), as the numbers gets`> ~10**28`

, the approximation for

the squareroot becomes more and more larger than the actual value. This will causeprime?

to do increasingly more work than necessary as the numbers get bigger.

(Completely as an aside, for numbers this size you should probably look to use: n.to_bn.prime?)

Math.sqrt(n) can be smaller than mathematical sqrt(n).

```
% ruby -ve '1.upto(50) {|i| j = 10**i; j2 = Math.sqrt(j*j).to_i; p [i, j2 - j] if j != j2 }'
ruby 2.5.0dev (2017-02-21 trunk 57675) [x86_64-linux]
[23, -8388608]
[24, -16777216]
[25, 905969664]
[26, 4764729344]
[27, 13287555072]
[28, -416880263168]
[29, -8566849142784]
[30, 19884624838656]
[31, -364103705034752]
[32, 5366162204393472]
[33, -54424769012957184]
[34, -544247690129571840]
[35, -3136633892082024448]
[36, 42420637374017961984]
[37, -461237341797878857728]
[38, -2251190176543965970432]
[39, -60290833628396821413888]
[40, 303786028427003666890752]
[41, 620008645040778319495168]
[42, 44885712678075916785549312]
[43, 139372116959414099130712064]
[44, -10985679223259661757684121600]
[45, -70242710975464448780069240832]
[46, -68601809640529787052340805632]
[47, 4384584304507619735463404765184]
[48, 43845843045076197354634047651840]
[49, -535097230524518206803127585210368]
[50, 7629769841091887003294964970946560]
```

So, Integer#prime? can underestimate the limit to search factors.

Assume i is 1000000000000000031.

I think (i*i).prime? returns true.
1000000000000000031 is not searched because
Math.sqrt(i*i).to_i is 1000000000000000000.

(I couldn't complete (i*i).prime? because it takes too long time.)

```
% ruby -e 'i=1000000000000000031; p i, Math.sqrt(i**2).to_i'
1000000000000000031
1000000000000000000
```

I feel this is a good evidence that we need integer sqrt.

It would be interesting to see what an audit of the entire Ruby code base would reveal where

this is used to provide the integer squareroot, versus the floating point one.

Simple search result of 'sqrt(.*to_i' using gem-codesearch.

https://github.com/akr/gem-codesearch

```
Oompa-euler-1.0.3/lib/euler.rb: max = Math.sqrt(to_factor).to_i + 1
Oompa-euler-1.0.3/lib/euler.rb: 5.step(Math.sqrt(self).to_i, 2) do |x|
Oompa-euler-1.0.3/lib/euler.rb: 3.step(Math.sqrt(max).to_i, 2) { |i|
algebra-0.2.3/lib/algebra/polynomial-factor-int.rb: (Math.log(c * Math.sqrt(s))/Math.log(u)).to_i + 1
blackwinter-gsl-1.15.3.2/examples/random/randomwalk.rb:sigma = Math::sqrt(N).to_i
boggle-0.0.2/bin/boggle:opts[:size] = Math.sqrt(opts[:board].size).to_i if !opts[:board].empty?
boggler-0.0.1/lib/boggler/grid.rb: @size = Math.sqrt(@dice.length).to_i
boqwij-0.1.0/lib/boqwij/integer_extensions.rb: 5.step(Math.sqrt(self).to_i, 2) do |x|
boqwij-0.1.0/lib/boqwij/integer_extensions.rb: max = Math.sqrt(to_factor).to_i + 1
clusterer-0.1.9/lib/clusterer/clustering.rb: options[:no_of_clusters] ||= Math.sqrt(objects.size).to_i
collective-0.2.1/lib/collective/checker.rb: @estimate > 0 ? Math.sqrt(@estimate).to_i : 1
cw-0.3.3/lib/cw/read.rb: # @magnitude = Math.sqrt(magnitude_squared).to_i
euler-1.1.0/lib/euler.rb: max = Math.sqrt(to_factor).to_i + 1
euler-1.1.0/lib/euler.rb: 5.step(Math.sqrt(self).to_i, 2) do |x|
euler-1.1.0/lib/euler.rb: 3.step(Math.sqrt(max).to_i, 2) { |i|
euler-1.1.0/lib/rudoku.rb: @sqrt_n = Math.sqrt(@n).to_i
gecoder-1.1.1/lib/gecoder/interface/constraints/int/arithmetic.rb: Gecode::Raw::sqrt(@model.active_space, int_op.to_int_var.bind,
gecoder-with-gecode-1.1.1.1/lib/gecoder/interface/constraints/int/arithmetic.rb: Gecode::Raw::sqrt(@model.active_space, int_op.to_int_var.bind,
gphys-1.5.1/lib/numru/gphys/interpolate.rb: ndiv[i] = Math.sqrt( (kx[i+1]-kx[i])**2 + (ky[i+1]-ky[i])**2).to_i
green_shoes-1.1.374/samples/sample56.rb: .text(lambda {|d| d.class_name[0, Math.sqrt(d.node_value).to_i / 8]})
gsl-2.1.0.2/examples/random/randomwalk.rb:sigma = Math::sqrt(N).to_i
gsl-nmatrix-1.15.3.2/examples/random/randomwalk.rb:sigma = Math::sqrt(N).to_i
hotcocoa-0.6.3/lib/hotcocoa/graphics/elements/sandpainter.rb: #@grains = (sqrt((ox-x)*(ox-x)+(oy-y)*(oy-y))).to_i
huge_enumerable-0.0.2/lib/huge_enumerable.rb: increment = next_prime(( 2 * domain_size / (1 + Math.sqrt(5)) ).to_i)
ifmapper-2.0.4/lib/IFMapper/FXSpline.rb: num = Math.sqrt( x2 + y2 ).to_i / 16
jwthumbs-0.1.0/lib/shutter.rb: gridsize = Math.sqrt(@images.length).ceil.to_i
lazylist-0.3.2/examples/examples.rb: not (2..Math.sqrt(x).to_i).find { |d| x % d == 0 }
magic_cloud-0.0.4/lib/magic_cloud/layouter/place.rb: case (Math.sqrt(1 + 4 * sign * t) - sign).to_i & 3
mapbaidu-0.0.7/lib/map_baidu.rb: x = Math.sqrt(radius**2 - y**2).ceil.to_i
mapcache-0.1.1/lib/mgmaps_export.rb: tpf_x = Math.sqrt(tpf).to_i
mapcache-0.1.1/lib/mgmaps_export.rb: tpf_x = Math.sqrt(tpf * 2).to_i
matrix_disp-0.0.3/lib/matrix_disp.rb: for i in (0...Math.sqrt(other[0].size).to_i) do
matrix_disp-0.0.3/lib/matrix_disp.rb: for x in (0...Math.sqrt(other[0].size).to_i) do
matrix_disp-0.0.3/lib/matrix_disp.rb: for y in (0...Math.sqrt(other[0].size).to_i) do
matrix_disp-0.0.3/lib/matrix_disp.rb: adj[z] = other[0][x+(y*Math.sqrt(other[0].size).to_i)]
matrix_disp-0.0.3/lib/matrix_disp.rb: resultado = resultado + tmp[i+(j*Math.sqrt(other[0].size).to_i)]*det(adj)
matrix_disp-0.0.3/lib/matrix_disp.rb: resultado = resultado - tmp[i+(j*Math.sqrt(other[0].size).to_i)]*det(adj)
minesweeper-console-1.0.1/test/minesweeper/console/prettyprinter/minefield_pretty_printer_test.rb: Math.sqrt(@string_representation.length).to_i
mlanett-hive-0.4.0/lib/hive/checker.rb: @estimate > 0 ? Math.sqrt(@estimate).to_i : 1
mtah-ruby-treemap-0.0.3.2/lib/treemap/node.rb: (base_size * Math.sqrt(self.bounds.width * self.bounds.height) / 125).to_i
naga_perfectsquare-1.0.1/lib/naga_perfectsquare.rb: Math.sqrt(self) == Math.sqrt(self).to_i
narray-0.6.1.2/src/lib/narray/narray_ext.rb: m = ((n+Math::sqrt(n))*1.27).to_i
narray-bigmem-0.0.0/lib/narray_ext.rb: m = ((n+Math::sqrt(n))*1.27).to_i
nsudoku-0.2.0/lib/nsudoku/checker.rb: @width = Math.sqrt(@sudoku.length).to_i
nsudoku-0.2.0/lib/nsudoku/checker.rb: @block_width = Math.sqrt(@width).to_i
nsudoku-0.2.0/lib/nsudoku/solver.rb: @width = Math.sqrt(sudoku.length).to_i
nsudoku-0.2.0/lib/nsudoku/solver.rb: @block_width = Math.sqrt(@width).to_i
nsudoku-0.2.0/lib/nsudoku.rb: @width = Math.sqrt(@sudoku.length).to_i
nsudoku-0.2.0/lib/nsudoku.rb: @block_width = Math.sqrt(@width).to_i
numb-0.186.0/lib/numb/sum_of_squares.rb: s = Math.sqrt(i).to_i
numeric_memoist-0.0.2/examples/isqrt_example.rb: $stderr.puts "calculate sqrt(#{self}).to_i (<= MaxIntFloat)"
numeric_memoist-0.0.2/examples/isqrt_example.rb: return Math.sqrt(self).to_i
numeric_memoist-0.0.2/examples/isqrt_example.rb: $stderr.puts "calculate sqrt(#{self}).to_i"
numeric_memoist-0.0.2/examples/isqrt_example.rb: # STDERR> calculate sqrt(123456789).to_i (<= MaxIntFloat)
numeric_memoist-0.0.2/examples/isqrt_example.rb: # STDERR> calculate sqrt(986960440108935861883449099987613301336842162813430234017512025).to_i
numru-narray-1.0.3/lib/numru/narray_ext.rb: m = ((n+Math::sqrt(n))*1.27).to_i
pixeldistance-0.2.1/lib/pixeldistance.rb: Math.sqrt((x1-x2)**2 + (y1-y2)**2).to_i >> (21 - zoom)
prime-multiplication-1.0/lib/PrimeMultiplication.rb: (2..(Math.sqrt(number).to_i)).each do |i|
prime_numbers-0.0.4/lib/prime_numbers/prime_int.rb: 2.upto(Math.sqrt(num).to_i) do |x|
prime_table-0.0.2/lib/prime_table/prime.rb: sqrt = Math.sqrt(self).to_i
primes-utils-2.7.0/lib/primes/utils.rb: sqrtN = Math.sqrt(n).to_i
primes-utils-2.7.0/lib/primes/utils.rb: sqrtN = Math.sqrt(n).to_i
primes-utils-2.7.0/lib/primes/utils.rb: factors << p; r -=1; n /= p; sqrtN = Math.sqrt(n).to_i
primes-utils-2.7.0/lib/primes/utils.rb: sqrtN = Math.sqrt(num).to_i # sqrt of end_num (end of range)
primes-utils-2.7.0/lib/primes/utils.rb: if start_num <= Math.sqrt(num).to_i # for one array of primes upto N
print_square-0.0.2/lib/print_square/command_runner.rb: size = Math.sqrt(number).to_i
quasi-0.0.6/lib/halton.rb: m=Math.sqrt(q).to_int
rlsm-1.8.1/lib/rlsm/monoid.rb: @order = Math::sqrt(@table.size).to_i
royw-drbman-0.0.6/examples/primes/lib/primes/sieve_of_eratosthenes.rb: sqr_primes = primes(Math.sqrt(maximum).to_i, drbman)
ruby-compiler-0.1.1/vendor/ruby/lib/prime.rb: (7..Math.sqrt(self).to_i).step(30) do |p|
ruby-spark-1.2.1/benchmark/comparison/ruby.rb: upper = Math.sqrt(x.to_f).to_i
ruby-spark-1.2.1/lib/spark/rdd.rb: max_sample_size = Integer::MAX - (num_st_dev * Math.sqrt(Integer::MAX)).to_i
ruby-treemap-0.0.4/lib/treemap/node.rb: (base_size * Math.sqrt(self.bounds.width * self.bounds.height) / 125).to_i
ruby-treemap-fork-0.0.4/lib/treemap/node.rb: (base_size * Math.sqrt(self.bounds.width * self.bounds.height) / 125).to_i
rubyvis-0.6.1/examples/5_pv_hierarchies/bubble_charts.rb: .text(lambda {|d| d.class_name[0, Math.sqrt(d.node_value).to_i / 8]})
rusby-0.1.2/lib/rusby/profiler.rb: SQRT_ITERATIONS = Math.sqrt(1000).to_i
simstring_pure-1.1.1/lib/simstring_pure.rb: (alpha * Math.sqrt(query_size * y_size)).ceil.to_i
sudoku_maker-0.0.2/lib/sudoku_maker/maker.rb: @decision_num = Math.sqrt(num).to_i
sudoku_solver-1.0.0/lib/sudoku_solver/solver.rb: @dimension = Math.sqrt(@width).to_i
tactical_tic_tac_toe-1.0.0/lib/board.rb: Math.sqrt(marked_spaces.count).to_i
taskjuggler-3.6.0/lib/taskjuggler/reports/ChartPlotter.rb: rr = (r / Math.sqrt(2.0)).to_i
tdiary-contrib-5.0.3/plugin/category_to_tagcloud.rb: level = ((Math.sqrt(count) - min) * factor).to_i
tic_tac_toes-0.1.8/lib/tic_tac_toes/ui/serializer.rb: row_size = Math.sqrt(board_structure.count).to_i
ul-wukong-4.1.1/lib/wukong/widget/reducers/bin.rb: self.num_bins = Math.sqrt(total_count).to_i
whitestone-1.0.2/etc/extra_tests/output_examples_code.rb: (2..Math.sqrt(n).to_i).map { |i| n % i }.all? { |rem| rem != 0 }
wukong-4.0.0/lib/wukong/widget/reducers/bin.rb: self.num_bins = Math.sqrt(total_count).to_i
```

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

I did a simplifications in ** isqrt1**, replacing all the '+'s with '|' operations,

leaving only the math subtraction operation

`n -= t `

. I'll see if I can get rid of that.This makes it a bit faster, especially for numbers `< 10**400`

, where it starts being

faster than ** iroot2**.

This suggests, for optimal overall speed, that a hybrid method can be crafted, like I

proposed for using **Math.sqrt(n).to_i**, like:

```
def sqrt_i(n); n < UPPER_BOUND ? n.iroot2 : isqrt1(n) end
class Integer
def irootn(n)
return nil if self < 0 && n.even?
raise "root n is < 2 or not an Integer" unless n.is_a?(Integer) && n > 1
num = self.abs
bits_shift = (num.bit_length)/n + 2
root, bitn_mask = 0, (1 << bits_shift)
until (bitn_mask >>= 1) == 0
root |= bitn_mask
root ^= bitn_mask if root**n > num
end
root *= (self < 0 ? -1 : 1)
end
def iroot2; irootn(2) end
end
def isqrt(n)
r = 0
x = 1 << ((n.bit_length >> 1 ) << 1)
while x != 0 do
t = r + x
if n >= t
n -= t
r = (r >> 1) + x
else
r >>= 1
end
x >>= 2
end
r
end
def isqrt1(n)
r = 0
x = 1 << ((n.bit_length >> 1 ) << 1)
until x == 0
t = r | x
r >>= 1
(n -= t; r |= x) if n >= t
x >>= 2
end
r
end
Benchmark.ips do |x|
exp = 400; n = 10**exp
puts "integer squareroot tests for n = 10**#{exp}"
x.report("iroot2" ) { n.iroot2 }
# x.report("irootn(2)" ) { n.irootn(2) }
x.report("isqrt(n)" ) { isqrt(n) }
x.report("isqrt1(n)" ) { isqrt1(n) }
x.compare!
end
```

```
2.4.0 :118 > load 'irootstest.rb'
integer squareroot tests for n = 10**50
Warming up --------------------------------------
iroot2 4.157k i/100ms
isqrt(n) 3.590k i/100ms
isqrt1(n) 4.029k i/100ms
Calculating -------------------------------------
iroot2 41.870k (± 3.7%) i/s - 212.007k in 5.070841s
isqrt(n) 36.039k (± 3.6%) i/s - 183.090k in 5.087128s
isqrt1(n) 40.580k (± 3.4%) i/s - 205.479k in 5.069789s
Comparison:
iroot2: 41870.0 i/s
isqrt1(n): 40579.8 i/s - same-ish: difference falls within error
isqrt(n): 36039.1 i/s - 1.16x slower
=> true
2.4.0 :119 > load 'irootstest.rb'
integer squareroot tests for n = 10**100
Warming up --------------------------------------
iroot2 1.385k i/100ms
isqrt(n) 879.000 i/100ms
isqrt1(n) 883.000 i/100ms
Calculating -------------------------------------
iroot2 14.004k (± 3.2%) i/s - 70.635k in 5.049521s
isqrt(n) 8.558k (± 3.9%) i/s - 43.071k in 5.040986s
isqrt1(n) 9.173k (± 3.8%) i/s - 45.916k in 5.012857s
Comparison:
iroot2: 14003.5 i/s
isqrt1(n): 9173.2 i/s - 1.53x slower
isqrt(n): 8557.9 i/s - 1.64x slower
=> true
2.4.0 :120 > load 'irootstest.rb'
integer squareroot tests for n = 10**200
Warming up --------------------------------------
iroot2 425.000 i/100ms
isqrt(n) 377.000 i/100ms
isqrt1(n) 393.000 i/100ms
Calculating -------------------------------------
iroot2 4.248k (± 4.1%) i/s - 21.250k in 5.010948s
isqrt(n) 3.722k (± 4.5%) i/s - 18.850k in 5.075627s
isqrt1(n) 3.876k (± 4.9%) i/s - 19.650k in 5.081749s
Comparison:
iroot2: 4248.2 i/s
isqrt1(n): 3876.3 i/s - 1.10x slower
isqrt(n): 3721.7 i/s - 1.14x slower
=> true
2.4.0 :121 > load 'irootstest.rb'
integer squareroot tests for n = 10**300
Warming up --------------------------------------
iroot2 252.000 i/100ms
isqrt(n) 219.000 i/100ms
isqrt1(n) 238.000 i/100ms
Calculating -------------------------------------
iroot2 2.477k (± 6.5%) i/s - 12.348k in 5.008024s
isqrt(n) 2.263k (± 6.6%) i/s - 11.388k in 5.060793s
isqrt1(n) 2.407k (± 4.0%) i/s - 12.138k in 5.050951s
Comparison:
iroot2: 2476.8 i/s
isqrt1(n): 2407.2 i/s - same-ish: difference falls within error
isqrt(n): 2262.7 i/s - same-ish: difference falls within error
=> true
2.4.0 :122 > load 'irootstest.rb'
integer squareroot tests for n = 10**400
Warming up --------------------------------------
iroot2 101.000 i/100ms
isqrt(n) 156.000 i/100ms
isqrt1(n) 165.000 i/100ms
Calculating -------------------------------------
iroot2 972.069 (± 6.0%) i/s - 4.848k in 5.006825s
isqrt(n) 1.495k (± 7.3%) i/s - 7.488k in 5.038471s
isqrt1(n) 1.611k (± 7.0%) i/s - 8.085k in 5.045823s
Comparison:
isqrt1(n): 1610.8 i/s
isqrt(n): 1494.9 i/s - same-ish: difference falls within error
iroot2: 972.1 i/s - 1.66x slower
```

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

Thanks Akira for that list.

So, too small integer squareroots potentially can cause miss finding primes in some apps,

and being too large can create doing more work than necessary in some apps, and just being

wrong in other apps has unknown consequences. :(

I know how this affected my specific apps, and was diligent (lucky) enough to track it down.

How many other situations are out there being affected by this that people don't even know

are giving wrong answers, or know something is amiss and have no hint of what's causing it?

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

So the clear winner, and new champeen is...** newtons_fast**, at least for squareroots.

Just goes to show, don't f**k with Newton!

```
class Integer
def irootn(n)
return nil if self < 0 && n.even?
raise "root n is < 2 or not an Integer" unless n.is_a?(Integer) && n > 1
num = self.abs
bits_shift = (num.bit_length)/n + 2
root, bitn_mask = 0, (1 << bits_shift)
until (bitn_mask >>= 1) == 0
root |= bitn_mask
root ^= bitn_mask if root**n > num
end
root *= (self < 0 ? -1 : 1)
end
def iroot2; irootn(2) end
end
def isqrt1(n)
r = 0
x = 1 << ((n.bit_length >> 1 ) << 1)
until x == 0
t = r | x
r >>= 1
(n -= t; r |= x) if n >= t
x >>= 2
end
r
end
def newtons_fast(n) # Newton's method with √n < initial_x <= 2*√n
raise if n<0
return n if n<=1
x = 1<<((n.bit_length+1)/2)
y = (x + n/x)/2
while y < x
x = y
y = (x + n/x)/2
end
x
end
Benchmark.ips do |x|
exp = 400; n = 10**exp
puts "integer squareroot tests for n = 10**#{exp}"
x.report("iroot2" ) { n.iroot2 }
x.report("isqrt1(n)" ) { isqrt1(n) }
x.report("newtons_fast(n)") { newtons_fast(n)
x.compare!
end
```

```
2.4.0 :138 > load 'irootstest.rb'
integer squareroot tests for n = 10**5
Warming up --------------------------------------
iroot2 85.000k i/100ms
isqrt1(n) 87.646k i/100ms
newtons_fast(n) 191.796k i/100ms
Calculating -------------------------------------
iroot2 1.014M (± 2.9%) i/s - 5.100M in 5.031392s
isqrt1(n) 1.049M (± 3.0%) i/s - 5.259M in 5.018645s
newtons_fast(n) 2.846M (± 4.6%) i/s - 14.385M in 5.066760s
Comparison:
newtons_fast(n): 2845709.1 i/s
isqrt1(n): 1048817.1 i/s - 2.71x slower
iroot2: 1014498.5 i/s - 2.81x slower
=> true
2.4.0 :139 > load 'irootstest.rb'
integer squareroot tests for n = 10**50
Warming up --------------------------------------
iroot2 4.077k i/100ms
isqrt1(n) 3.992k i/100ms
newtons_fast(n) 24.377k i/100ms
Calculating -------------------------------------
iroot2 41.189k (± 6.1%) i/s - 207.927k in 5.071463s
isqrt1(n) 40.293k (± 3.4%) i/s - 203.592k in 5.058807s
newtons_fast(n) 260.960k (± 3.4%) i/s - 1.316M in 5.050191s
Comparison:
newtons_fast(n): 260959.7 i/s
iroot2: 41188.8 i/s - 6.34x slower
isqrt1(n): 40292.7 i/s - 6.48x slower
=> true
2.4.0 :140 > load 'irootstest.rb'
integer squareroot tests for n = 10**500
Warming up --------------------------------------
iroot2 74.000 i/100ms
isqrt1(n) 127.000 i/100ms
newtons_fast(n) 3.775k i/100ms
Calculating -------------------------------------
iroot2 741.359 (± 4.6%) i/s - 3.700k in 5.001895s
isqrt1(n) 1.258k (± 7.0%) i/s - 6.350k in 5.077988s
newtons_fast(n) 37.762k (± 4.5%) i/s - 188.750k in 5.008756s
Comparison:
newtons_fast(n): 37761.5 i/s
isqrt1(n): 1257.6 i/s - 30.03x slower
iroot2: 741.4 i/s - 50.94x slower
=> true
2.4.0 :141 > load 'irootstest.rb'
integer squareroot tests for n = 10**5000
Warming up --------------------------------------
iroot2 1.000 i/100ms
isqrt1(n) 5.000 i/100ms
newtons_fast(n) 340.000 i/100ms
Calculating -------------------------------------
iroot2 11.196 (± 8.9%) i/s - 56.000 in 5.013699s
isqrt1(n) 53.056 (± 3.8%) i/s - 265.000 in 5.002572s
newtons_fast(n) 3.402k (± 3.1%) i/s - 17.000k in 5.002400s
Comparison:
newtons_fast(n): 3401.9 i/s
isqrt1(n): 53.1 i/s - 64.12x slower
iroot2: 11.2 i/s - 303.85x slower
```

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

Another reason this needs to change is because **Math.sqrt(n).to_i**

has an upper limit before crashing.

Here's something else to consider:

There is all that code out there using **Math.sqt(n).to_i** and I presume

most people expect it to return the correct result (which many probably

don't even realize what that should be).

Is there some possible way for the smart Ruby Gurus to fix the parser so

that when it sees the phrase **Math.sqrt(xyz).to_i** it uses whatever

chosen correct implementation decided upon, so all that old legacy code

out there, in the core, gems, and anywhere else, will just magically work?

I know it's a hack, but Ruby is designed to make programmers happy, and it

wasn't their fault this bug exists. I know it would make me happy to not

have to go back over everything I've done and replace it with a new method name.

Just something to consider.

```
2.4.0 :166 > load 'irootstest.rb'
integer squareroot tests for n = 10**308
Warming up --------------------------------------
newtons_fast(n) 5.042k i/100ms
Math.sqrt(n).to_i 167.539k i/100ms
Calculating -------------------------------------
newtons_fast(n) 51.171k (± 4.3%) i/s - 257.142k in 5.034762s
Math.sqrt(n).to_i 2.411M (± 6.1%) i/s - 12.063M in 5.022522s
Comparison:
Math.sqrt(n).to_i: 2411213.7 i/s
newtons_fast(n): 51170.9 i/s - 47.12x slower
=> true
2.4.0 :167 > load 'irootstest.rb'
integer squareroot tests for n = 10**309
Warming up --------------------------------------
newtons_fast(n) 4.190k i/100ms
Math.sqrt(n).to_iFloatDomainError: Infinity
from irootstest.rb:93:in `to_i'
from irootstest.rb:93:in `block (2 levels) in <top (required)>'
from /home/jzakiya/.rvm/gems/ruby-2.4.0/gems/benchmark-ips-2.7.2/lib/benchmark/ips/job/entry.rb:53:in `call_times'
from /home/jzakiya/.rvm/gems/ruby-2.4.0/gems/benchmark-ips-2.7.2/lib/benchmark/ips/job.rb:220:in `block in run_warmup'
from /home/jzakiya/.rvm/gems/ruby-2.4.0/gems/benchmark-ips-2.7.2/lib/benchmark/ips/job.rb:206:in `each'
from /home/jzakiya/.rvm/gems/ruby-2.4.0/gems/benchmark-ips-2.7.2/lib/benchmark/ips/job.rb:206:in `run_warmup'
from /home/jzakiya/.rvm/gems/ruby-2.4.0/gems/benchmark-ips-2.7.2/lib/benchmark/ips/job.rb:185:in `block in run'
from /home/jzakiya/.rvm/gems/ruby-2.4.0/gems/benchmark-ips-2.7.2/lib/benchmark/ips/job.rb:184:in `times'
from /home/jzakiya/.rvm/gems/ruby-2.4.0/gems/benchmark-ips-2.7.2/lib/benchmark/ips/job.rb:184:in `run'
from /home/jzakiya/.rvm/gems/ruby-2.4.0/gems/benchmark-ips-2.7.2/lib/benchmark/ips.rb:59:in `ips'
from irootstest.rb:85:in `<top (required)>'
from (irb):167:in `load'
from (irb):167
from /home/jzakiya/.rvm/rubies/ruby-2.4.0/bin/irb:11:in `<main>'
2.4.0 :168 >
```

#### Updated by Student (Nathan Zook) over 7 years ago

First, please, PLEASE do not credit me with the optimized bit-by-bit method. I'm a formally trained mathematician, and I DO NOT want credit for someone else's work. It came from the above-referenced Reddit post.

Second, it appears that we must have a pretty efficient MP divide, or any direct version of Newton's method will perform poorly. Obviously, I now need to go ahead & see what I get with Zimmerman's method...

And probably look into Newton-Raphson for comparison. As I stated, this will take work. BTW, performance of Newton's method depends a lot on N. If I come up with a rival, we will likely need a lot of tests to validate speed claims.

Third, please don't get me started on Prime.rb when it comes to large numbers. :D The implementation there is simply not to be used even for full-word numbers. Try timing (1 << 56) + 81 to see what I mean. I've actually done experiments with Miller-Rabin, and...I'll probably be submitting some code soon.

I had not thought about the implications of Math.sqrt overestimating. It is a float, so yeah.

Actually, I need to leave prime.rb alone if I'm going to work here. There is some crazy work done for small (< 2^^64) numbers...

#### Updated by Student (Nathan Zook) over 7 years ago

```
# Core Algorithm by Paul Zimmerman, article entitled
# Karatsuba Square Root
# https://hal.inria.fr/inria-00072854/document
# Presumably used in GMP.
# Personal computations derived from implementation details (page 5)
# n >= b**4 / 4
# b = 2**64**k
# n * 4 >= b ** 4
# n.length + 2 >= b.length * 4
# (n.length + 2) >> 2 >= b.length
# (n.length + 2) >> 2 >= 64 * k
# ((n.length + 2) >> 2) / 64 = k
def sqrtrem(n)
nlength = n.bit_length
if nlength >= (64 * 4 - 2)
bbits = (n.bit_length + 2 >> 2) / 64 * 64
elsif nlength >= (32 * 4 - 2)
bbits = (n.bit_length + 2 >> 2) / 32 * 32
elsif nlength >= (16 * 4 - 2)
bbits = (n.bit_length + 2 >> 2) / 16 * 16
else # single word now -- my computer has 64-bit mantissas!
s = Math.sqrt(n).to_i
r = n - s * s
return [s, r]
end
b = 1 << bbits
bmask = b - 1
a3 = n >> bbits * 3
a2 = (n >> bbits * 2) & bmask
a1 = (n >> bbits ) & bmask
a0 = n & bmask
i, j = sqrtrem((a3 << bbits) + a2)
q, u = ((j << bbits) + a1).divmod(i << 1)
s = (i << bbits) + q
r = (u << bbits) + a0 - q ** 2
if r < 0
r += (s << 1) - 1
s -= 1
end
[s, r]
end
def sqrt_z(n)
s, r = sqrtrem(n)
s
end
Benchmark.ips do |x|
n = 10 ** 5000
x.report("iroot2") { n.iroot2 }
x.report("sqrt_z") { sqrt_z(n) }
end
Warming up --------------------------------------
iroot2 1.000 i/100ms
sqrt_z 987.000 i/100ms
Calculating -------------------------------------
iroot2 8.665 (± 0.0%) i/s - 44.000 in 5.078551s
sqrt_z 10.015k (± 1.6%) i/s - 50.337k in 5.027514s
=>
```

I've been at this for way too long today.

A few notes:

- Assuming no architectural weirdness, it looks like--for this PARTICULAR case--Zimmerman's algorithm is doing about 4x better than what you are calling newtons_fast. First, this is not a surprise, as no one uses Newton's method this way. What is interesting is that Zimmerman's algorithm is designed to be good on GMP, and I am using stock ruby.
- That hack at the top can almost certainly be dramatically improved (It's my contribution). This is a demo of Zimmerman's work--which doesn't even really want to talk about numbers smaller than 2**254. Please don't bother looking at performance under 1000 bits until that is cleaned up.
- In a note that Zimmerman included in his article, he indicated that Newton-Raphson would probably be faster. As I stated, I have been unable to find an implementation, and I am loathe to present anything without high confidence that it is correct. I will do some more looking tomorrow.
- In some sense it is silly to do timing tests like this. We have no idea how much time is being spent in ruby and how much in C. It is entirely likely that we could have 10x speed improvements (or much, much more) across certain ranges for some of these methods by dropping into C. As I said, it is a question of how much work we want to do. At a minimum, we should do a GMP-linked build and see what Zimmerman's work can really do. Again, I don't know the legal considerations, and I would like guidance before attempting to reimplement an MP library from scratch.

#### Updated by Student (Nathan Zook) over 7 years ago

Sorry to spam like this, but I really have to disagree regarding some of the implications drawn for Math.sqrt(n). This returns a float. If you don't know about the pointy edges of IEEE-754, but you do care, then shame on you. But allow me to calm you regarding Numeric#prime?. Let p be the largest prime < 2 ** 32. Run (p * p).prime?. When we fix that problem, we can take care of the rounding issue.

#### Updated by Student (Nathan Zook) over 7 years ago

Following the discussion from the second answer at http://cs.stackexchange.com/questions/37596/arbitrary-precision-integer-square-root-algorithm, I implemented an integer Newton - Raphson on the inverse square root. This involved correcting an error and clarifying some things. The results were disappointing, as the asymptotic behavior is clearly beaten by the transcribed Zimmerman algorithm. The crossover was somewhat beyond 10 ** 1000. For reference, I made a faster Newton method by using Math.sqrt for the initial guess.

I also adjusted my transcription of the Zimmerman algorithm to reflect the fact that Matz implemented a simple , portable Math.sqrt. Before we use it, we would need to do serious testing, of course.

One more round of benchmarks. I have removed the warmup & comparison sections. I also implemented a newton_faster method, which uses Math.sqrt for the initial guess.

```
integer squareroot tests for n = 10**50
Calculating -------------------------------------
newtons_fast(n) 176.714k (± 1.3%) i/s - 891.495k in 5.045686s
newton_faster(n) 389.709k (± 0.8%) i/s - 1.952M in 5.009508s
sqrt_z(n) 166.894k (± 1.4%) i/s - 847.715k in 5.080312s
inverse Newton(n) 250.008k (± 0.4%) i/s - 1.271M in 5.083042s
integer squareroot tests for n = 10**500
Calculating -------------------------------------
newtons_fast(n) 31.530k (± 3.4%) i/s - 158.496k in 5.032977s
newton_faster(n) 61.087k (± 3.1%) i/s - 306.020k in 5.014480s
sqrt_z(n) 33.167k (± 0.5%) i/s - 167.128k in 5.039057s
inverse Newton(n) 51.252k (± 3.6%) i/s - 257.972k in 5.041610s
integer squareroot tests for n = 10**1000
Calculating -------------------------------------
newtons_fast(n) 14.085k (± 0.8%) i/s - 71.656k in 5.087601s
newton_faster(n) 24.078k (± 3.3%) i/s - 121.074k in 5.034512s
sqrt_z(n) 25.410k (± 2.3%) i/s - 127.500k in 5.020581s
inverse Newton(n) 28.921k (± 2.7%) i/s - 145.350k in 5.030223s
integer squareroot tests for n = 10**2000
Calculating -------------------------------------
newtons_fast(n) 3.994k (± 0.7%) i/s - 20.247k in 5.069925s
newton_faster(n) 6.698k (± 0.7%) i/s - 34.017k in 5.079015s
sqrt_z(n) 19.176k (± 2.3%) i/s - 96.951k in 5.058731s
inverse Newton(n) 13.724k (± 1.7%) i/s - 68.952k in 5.025785s
integer squareroot tests for n = 10**4000
Calculating -------------------------------------
newtons_fast(n) 1.035k (± 0.4%) i/s - 5.253k in 5.074204s
newton_faster(n) 1.600k (± 0.9%) i/s - 8.000k in 5.001751s
sqrt_z(n) 12.478k (± 0.7%) i/s - 62.475k in 5.006905s
inverse Newton(n) 6.094k (± 0.7%) i/s - 30.957k in 5.079890s
integer squareroot tests for n = 10**5000
Calculating -------------------------------------
newtons_fast(n) 628.308 (± 0.6%) i/s - 3.162k in 5.032783s
newton_faster(n) 915.131 (± 0.4%) i/s - 4.641k in 5.071494s
sqrt_z(n) 10.107k (± 0.6%) i/s - 50.796k in 5.026155s
inverse Newton(n) 4.539k (± 3.0%) i/s - 22.850k in 5.038822s
```

It is possible that "inverse Newton" could be modified to advantage the particulars of our implementation of bignum, or GMP, should it be linked. I don't want spam any more performances after this until it is determined if we want to drop down to C for this.

One last thing to play with might be Halley's method. This converges cubically, but requires 4 multiplies. This should compute faster. But don't expect to see it very soon.

#### Updated by Student (Nathan Zook) over 7 years ago

All of the algorithms are now at https://github.com/NathanZook/ruby_sqrt. Hope that's okay.

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

Hey Nathan, Great Work!

I'm seeing an error in **inverse_newton_sqrt(n)** being one-off.

```
2.4.0 :224 > n = 10**10 ; puts sqrt_z(n), newtons_fast(n), newton_faster(n), inverse_newton_sqrt(n)
100000
100000
100000
100000
=> nil
2.4.0 :225 > n = 10**11 ; puts sqrt_z(n), newtons_fast(n), newton_faster(n), inverse_newton_sqrt(n)
316227
316227
316227
316227
=> nil
2.4.0 :226 > n = 10**12 ; puts sqrt_z(n), newtons_fast(n), newton_faster(n), inverse_newton_sqrt(n)
1000000
1000000
1000000
1000000
=> nil
2.4.0 :227 > n = 10**13 ; puts sqrt_z(n), newtons_fast(n), newton_faster(n), inverse_newton_sqrt(n)
3162277
3162277
3162277
3162277
=> nil
2.4.0 :228 > n = 10**14 ; puts sqrt_z(n), newtons_fast(n), newton_faster(n), inverse_newton_sqrt(n)
10000000
10000000
10000000
10000000
=> nil
2.4.0 :229 > n = 10**15 ; puts sqrt_z(n), newtons_fast(n), newton_faster(n), inverse_newton_sqrt(n)
31622776
31622776
31622776
31622776
=> nil
2.4.0 :230 > n = 10**16 ; puts sqrt_z(n), newtons_fast(n), newton_faster(n), inverse_newton_sqrt(n)
100000000
100000000
100000000
99999999
=> nil
2.4.0 :231 > n = 10**17 ; puts sqrt_z(n), newtons_fast(n), newton_faster(n), inverse_newton_sqrt(n)
316227766
316227766
316227766
316227765
=> nil
2.4.0 :232 > n = 10**20 ; puts sqrt_z(n), newtons_fast(n), newton_faster(n), inverse_newton_sqrt(n)
10000000000
10000000000
10000000000
9999999999
=> nil
2.4.0 :233 > n = 10**30 ; puts sqrt_z(n), newtons_fast(n), newton_faster(n), inverse_newton_sqrt(n)
1000000000000000
1000000000000000
1000000000000000
1000000000000000
=> nil
2.4.0 :234 > n = 10**40 ; puts sqrt_z(n), newtons_fast(n), newton_faster(n), inverse_newton_sqrt(n)
100000000000000000000
100000000000000000000
100000000000000000000
99999999999999999999
=> nil
2.4.0 :235 > n = 10**50 ; puts sqrt_z(n), newtons_fast(n), newton_faster(n), inverse_newton_sqrt(n)
10000000000000000000000000
10000000000000000000000000
10000000000000000000000000
9999999999999999999999999
=> nil
2.4.0 :236 >
```

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

Oh I see where at least part of the error comes from.

In this

```
def inverse_newton_sqrt(n)
raise if n < 0
return Math.sqrt(n).to_i if n < 1 << 53
```

See top of this thread at `#1 `

I showed earlier I found that **Math.sqrt(n).to_i** starts given incorrect

results for n > 9_999_899_999_899_999_322_536_673_279

I replaced `if n < 1 << 53`

with `if n <= 9_999_899_999_899_999_322_536_673_279 `

This eliminated the shown errors upto n = 10**40, when it started to be one-off again.

This is on a 64-bit Linux OS I7 cpu laptop.

#### Updated by stomar (Marcus Stollsteimer) over 7 years ago

Jabari Zakiya wrote:

I showed earlier I found that

Math.sqrt(n).to_istarts given incorrect

results for n > 9_999_899_999_899_999_322_536_673_279I replaced

`if n < 1 << 53`

with`if n <= 9_999_899_999_899_999_322_536_673_279`

Is this guaranteed to be platform independent?

And a more general consideration:

I assume `Integer.sqrt()`

will be implemented in C, not in Ruby? How significant or useful are all those benchmarks then? Micro-optimizing Ruby code for speed and then reimplementing it in C seems to be the wrong approach.

#### Updated by Student (Nathan Zook) over 7 years ago

Jabari Zakiya wrote:

Oh I see where at least part of the error comes from.

In this

`def inverse_newton_sqrt(n) raise if n < 0 return Math.sqrt(n).to_i if n < 1 << 53`

See top of this thread at

`#1`

I showed earlier I found thatMath.sqrt(n).to_istarts given incorrect

results for n > 9_999_899_999_899_999_322_536_673_279

```
9999899999899999322536673279 < 1 << 53
=> false
```

So this cannot be the source of this particular problem. I am not surprised that there are errors in inverse_newton_sqrt. It was quickly put together, designed to credibly determine if it is faster than the Zimmerman algorithm. Getting the last bit correct on perfect squares is actually hard. If it appeared that it would be a superior algorithm, I would have needed to have proven convergence. I had expected that the +2 in ins_find_initial_exponent would have sufficed, but it looks like I probably needed to add another bit because of the use of fixed point rather than floating point in the algorithm.

I will credit you with finding the errors.

The question now is, "where do we go from here?" Clearly, faster Newton and Zimmerman are the contenders. At this moment, faster Newton wins almost to 10 * 1000, which is about 3000 bits, on speed alone. It is also clearly correct. (It conceivably does not converge, but that is easy to fix.)

The Zimmerman algorithm only works if I got that hack at the top correct. Actually, if we switch to a different algorithm at that point, we can probably eliminate a full pass (or two??) through the algorithm, which will speed it up lot for smaller numbers. I guess I should reach out to him & see if he will release his work to the Ruby license, in which case I can rely upon his proofs & techniques... :D

Both will benefit if we drop to C, but Zimmerman will gain a lot more. (Besides all of the ruby that happens in the loop, consider also the tricks that can be played with extracting a3-a0, for instance.)

#### Updated by Student (Nathan Zook) over 7 years ago

Marcus Stollsteimer wrote:

Jabari Zakiya wrote:

I showed earlier I found that

Math.sqrt(n).to_istarts given incorrect

results for n > 9_999_899_999_899_999_322_536_673_279I replaced

`if n < 1 << 53`

with`if n <= 9_999_899_999_899_999_322_536_673_279`

Is this guaranteed to be platform independent?

If this were the issue, it would be. Math.sqrt() runs through some IEEE-754-specificed functions in C. I know of no funny business in these particular functions.

And a more general consideration:

I assume

`Integer.sqrt()`

will be implemented in C, not in Ruby? How significant or useful are all those benchmarks then? Micro-optimizing Ruby code for speed and then reimplementing it in C seems to be the wrong approach.

I do expect that we want to go with C, but I don't have an answer. The purpose of these benchmarks has been to get some idea of which algorithms are credible. We have run large enough samples that it is clear which one have better O() performance based on the way that ruby handles large numbers. So long as we are smart, that does not depend on the ruby code.

And see my just-posted comment for more. ;)

#### Updated by Student (Nathan Zook) over 7 years ago

It is now clear that Halley's method requires five multiplies, not the four previously reported by wikipedia. This makes it uninteresting as a candidate.

#### Updated by nobu (Nobuyoshi Nakada) over 7 years ago

**Status**changed from*Open*to*Closed*

#### Updated by Student (Nathan Zook) over 7 years ago

Jabari Zakiya wrote:

Oh I see where at least part of the error comes from.

In this

`def inverse_newton_sqrt(n) raise if n < 0 return Math.sqrt(n).to_i if n < 1 << 53`

See top of this thread at

`#1`

I showed earlier I found thatMath.sqrt(n).to_istarts given incorrect

results for n > 9_999_899_999_899_999_322_536_673_279I replaced

`if n < 1 << 53`

with`if n <= 9_999_899_999_899_999_322_536_673_279`

This eliminated the shown errors upto n = 10**40, when it started to be one-off again.

This is on a 64-bit Linux OS I7 cpu laptop.

Actually, with some understanding of how floating point arithmetic works, the first even root past 1 << 53 is likely to be bad:

```
irb(main):1081:0>
def msi(n)
x = Math.sqrt(n).to_i
big = x * x > n
small = x * x + 2 * x < n
puts [n, x, big, small]
end
irb(main):1082:0> msi(1<<53)
9007199254740992
94906265
false
false
=> nil
irb(main):1083:0> msi(94906266 ** 2 - 1)
9007199326062755
94906266
true
false
=> nil
```

;)

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

I want to raise for consideration, as a user, the **API behavior** for these methods.

I noticed in the other squareroot methods they raise an error for `n < 0`

whereas my version returns 'nil' when taking an even root of a negative integer.

Let me promote why this shouldn't be treated as a runtime error.

The reason I chose to return 'nil' is because in some applications where I'm

processing lots of data I don't want the program to bomb when encountering

negative integers, I want to continue and just choose how to process those conditions.

First, mathematically it is perfectly allowable to take the even root of a negative number,

it's just a complex number, so conceptually this shouldn't be an error, just unwanted for this method.

With **irootn|2** I know I will either get a real integer, which means n was a valid integer (and root)

or 'nil', which means it didn't have a real (as in complex part) root.

If I wanted to check for negativity in the first place I could always do something like:

```
def my_app_sqrt(n); n < 0 ? Complex(0, n.abs.iroot2) : n.iroot2 end
```

or using the **root** method from my **Roots** gem:

```
def my_app_sqrt(n); n < 0 ? n.root(2) : n.iroot2 end
```

As a user, I want to choose how to handle cases when my app may try to take an even root

of a negative integer/number and not automatically throw an error that crashes my program.

I think as long as this is clearly documented (either way) for the API a user will know

exactly how to deal with these conditions, but returning 'nil' gives the user total control.

#### Updated by stomar (Marcus Stollsteimer) over 7 years ago

Jabari Zakiya wrote:

I noticed in the other squareroot methods they raise an error for

`n < 0`

whereas my version returns 'nil' when taking an even root of a negative integer.

I strongly agree with nobu's implementation. That's exactly the behavior I was about to suggest when I noticed it was already there.

The reason I chose to return 'nil' is because in some applications where I'm

processing lots of data I don't want the program to bomb when encountering

negative integers, I want to continue and just choose how to process those conditions.

It doesn't have to bomb.

- for typical use cases (number theory, etc.) you will have only positive values anyway
- you can check your data before using the method
- you can use a begin...rescue block and handle the exception appropriately

Returning `nil`

for a mathematical operation doesn't make much sense. And a silent failure you are not aware of will bite you much more than an exception.

First, mathematically it is perfectly allowable to take the even root of a negative number,

it's just a complex number, so conceptually this shouldn't be an error, just unwanted for this method.

"Not an error, but unwanted"???

The point is that by premise the method doesn't make *any* sense in the complex domain. It's supposed to return the largest integer (natural number) less or equal to the square root. This wouldn't work for complex numbers at all: how would you truncate to an integer? how would you do the comparison? (Complex numbers are not ordered.)

Therefore, isqrt only does make sense for natural numbers. (Different from the "normal" sqrt.)

#### Updated by jzakiya (Jabari Zakiya) over 7 years ago

I disagree, but as long as the API explicitly states what's happens for this case a user can adapt accordingly

#### Updated by nobu (Nobuyoshi Nakada) over 7 years ago

**Description**updated (diff)