Misc #18587: What was the reason behind Ruby choosing SipHash for Hash? - Ruby - Ruby Issue Tracking System

Added by midnight (Sarun R) almost 4 years ago. Updated almost 4 years ago.

Status: Open

Assignee:

[ruby-core:107600]

Description

Hello

I am digging into the history behind Ruby's use of SipHash for its Hash.
I found that CVE-2012-5371 surfaced in 2012;
the Ruby maintainers decided to switch algorithms, probably because we wanted something quick to implement at the time.
The change went live in late 2012.

Fast forward to the Ruby 3x3 initiative: we now seem to care about performance again.
Hash DoS no longer seems to be an urgent threat, so we have time to be deliberate about Hash again.

I can't find the old discussion about Ruby's SipHash decision.
What I did find is that SipHash is not the only way to prevent hashtable DoS.
There is an interesting discussion on the Go side from late 2015:
https://github.com/golang/go/issues/9365

Just to recap, Go's authors argue that:

A cryptographic hash is not needed to construct a DoS-resistant hashtable.
If the random seed is chosen on a per-hashtable basis, an attack that a remote adversary can actually exploit seems unlikely.
If we want to be extra careful: since collisions are unlikely, a collision that does occur despite the per-hashtable seed can be handled as a special case, where we re-randomize the seed and rehash the keys.
The way the random seed is folded into the hash matters. For example, CityHash computes f(g(msg), seed); in that case, a collision in g causes a collision in f, because the output of g is independent of the seed.
Slowing down the hashtable for everyone to prevent a hard-to-exploit DoS doesn't seem to be a good trade-off.
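
To make the point about seed folding concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: `g` is a deliberately weak byte-sum hash (not CityHash), and `seed_last` / `seed_first` are hypothetical names for the two ways of mixing in a seed. It only shows where the seed enters the computation; it is not a claim that the seeded FNV-1a-style loop resists a real attack.

```python
MASK64 = (1 << 64) - 1
FNV_PRIME = 0x100000001B3  # 64-bit FNV prime, used here only as a convenient odd multiplier


def g(msg: bytes) -> int:
    """Deliberately weak, unseeded hash (byte sum), so collisions are easy to see."""
    return sum(msg) & MASK64


def seed_last(msg: bytes, seed: int) -> int:
    """f(g(msg), seed): the seed is mixed in only after g has consumed the message."""
    return ((g(msg) ^ seed) * FNV_PRIME) & MASK64


def seed_first(msg: bytes, seed: int) -> int:
    """The seed initializes the state, so every byte is absorbed into seeded state."""
    h = seed & MASK64
    for byte in msg:
        h = ((h ^ byte) * FNV_PRIME) & MASK64
    return h


if __name__ == "__main__":
    a, b = b"ab", b"ba"  # g(a) == g(b): same bytes, different order
    for seed in (1, 2, 3):
        print(seed,
              seed_last(a, seed) == seed_last(b, seed),
              seed_first(a, seed) == seed_first(b, seed))
```

Because `g(b"ab") == g(b"ba")`, `seed_last` collides on that pair for every seed: the attacker can precompute the collision without knowing the seed. With `seed_first`, the same pair is no longer an automatic collision, since the seed influences how each byte is absorbed.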

In the actual implementation, they use AES-NI to get the properties of a good pseudo-random function, and fall back to a non-cryptographic hash function on platforms without AES-NI.
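
The per-hashtable seed and the "re-randomize and rehash" fallback from the recap could be sketched roughly as follows. This is a sketch under stated assumptions, not Go's or Ruby's implementation: `seeded_hash` is a placeholder FNV-1a-style function standing in for whatever keyed function the runtime really uses, and `SeededTable`, `MAX_CHAIN`, and the chain-length trigger are hypothetical choices made for illustration.

```python
import secrets

MASK64 = (1 << 64) - 1


def seeded_hash(key: bytes, seed: int) -> int:
    """Placeholder seeded hash (FNV-1a style); stands in for a real keyed function."""
    h = seed & MASK64
    for byte in key:
        h = ((h ^ byte) * 0x100000001B3) & MASK64
    return h ^ (h >> 32)


class SeededTable:
    """Chained hash table with its own random seed (assumption: bytes keys only)."""

    MAX_CHAIN = 8  # hypothetical threshold for a "suspiciously long" chain

    def __init__(self, nbuckets: int = 16):
        self.seed = secrets.randbits(64)  # per-table seed, not per-process
        self.buckets = [[] for _ in range(nbuckets)]
        self.count = 0
        self.reseeds = 0

    def _index(self, key: bytes) -> int:
        return seeded_hash(key, self.seed) % len(self.buckets)

    def get(self, key: bytes, default=None):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return default

    def set(self, key: bytes, value) -> None:
        chain = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # overwrite existing entry
                return
        chain.append((key, value))
        self.count += 1
        if self.count > len(self.buckets):
            # Ordinary growth: keep the load factor at or below 1.
            self._rebuild(len(self.buckets) * 2, self.seed)
        elif len(chain) > self.MAX_CHAIN:
            # Long chain despite a low load factor: pick a new seed and rehash.
            self.reseeds += 1
            self._rebuild(len(self.buckets), secrets.randbits(64))

    def _rebuild(self, nbuckets: int, seed: int) -> None:
        items = [kv for chain in self.buckets for kv in chain]
        self.seed = seed
        self.buckets = [[] for _ in range(nbuckets)]
        for k, v in items:
            self.buckets[self._index(k)].append((k, v))
```

The design choice being illustrated is that the expensive path (a full rehash with a fresh seed) runs only when a chain is unexpectedly long, so ordinary workloads pay only for the cheap hash function.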

Now that I have read the rationale on the Go side, I want to understand the rationale on the Ruby side too.
I was not around 10 years ago, and I can't find records or discussions from that time. There might be some Ruby limitations that make the approach described by Go's authors inapplicable.

So I am asking in the hope that someone still remembers what happened, the situation we were in 10 years ago, or the limitation of Ruby that prevents per-Hash seeds.

Updated by midnight (Sarun R) almost 4 years ago #1 [ruby-core:107601]

Updated by Eregon (Benoit Daloze) almost 4 years ago #2 [ruby-core:107602]

Updated by shyouhei (Shyouhei Urabe) almost 4 years ago #3 [ruby-core:107608]