Bug #13019
closedFix st_hash* functions
Description
Previous implementation had an issues:
- macros murmur1 assumes murmur_step takes rotation value
as a second argument - but murmur_step second argument is "next block"
- this makes st_hash_uint and st_hash_end to not mix high bits of
hash value into lower bits - this leads to pure hash behavior on doubles and mixing hashes using st_hash_uint.
It didn't matter when bins amount were prime numbers, but it hurts
when bins are powers of two.
Mistake were created cause of attempt to co-exist Murmur1 and Murmur2
in a same code.
Change it to single hash-function implementation.
- block function is in a spirit of Murmur functions,
but handles inter-block dependency a bit better (imho). - final block is read in bit more optimal way on CPU with unaligned word access,
- final block is mixed in simple way,
- finalizer is taken from MurmurHash3 (it makes most of magic :) )
(64bit finalizer is taken from
http://zimbry.blogspot.ru/2011/09/better-bit-mixing-improving-on.html)
Also remove ST_USE_FNV1: it lacks implementation of many functions,
and looks to be abandoned
Files
Updated by vmakarov (Vladimir Makarov) over 7 years ago
Yura Sokolov wrote:
Previous implementation had an issues:
- macros murmur1 assumes murmur_step takes rotation value
as a second argument- but murmur_step second argument is "next block"
- this makes st_hash_uint and st_hash_end to not mix high bits of
hash value into lower bits- this leads to pure hash behavior on doubles and mixing hashes using st_hash_uint.
It didn't matter when bins amount were prime numbers, but it hurts
when bins are powers of two.
Although performance of MRI hash benchmarks are not improved by this patch, I confirm a bad behaviour of the current variant of murmur hash in MRI. A half year ago, I ran it on SMhasher (https://github.com/aappleby/smhasher) and it failed on a few tests although murmur hashes in SMHasher itself passed the tests. So it is better to fix the bad quality for murmur hash in MRI. Sorry, I did not check your variant in SMHasher. I think it would be nice to do this.
Updated by funny_falcon (Yura Sokolov) over 7 years ago
It passes SMHasher (at least, little-endian variant).
Updated by shyouhei (Shyouhei Urabe) over 7 years ago
- Status changed from Open to Assigned
- Assignee set to nobu (Nobuyoshi Nakada)
Updated by nobu (Nobuyoshi Nakada) over 7 years ago
- Status changed from Assigned to Closed
Applied in changeset r57134.
st.c: fix st_hash* functions [Bug #13019]
Previous implementation had an issues:
- macros murmur1 assumes murmur_step takes rotation value
as a second argument - but murmur_step second argument is "next block"
- this makes st_hash_uint and st_hash_end to not mix high bits of
hash value into lower bits - this leads to pure hash behavior on doubles and mixing hashes using
st_hash_uint.
It didn't matter when bins amount were prime numbers, but it hurts
when bins are powers of two.
Mistake were created cause of attempt to co-exist Murmur1 and Murmur2
in a same code.
Change it to single hash-function implementation.
- block function is in a spirit of Murmur functions,
but handles inter-block dependency a bit better (imho). - final block is read in bit more optimal way on CPU with unaligned word access,
- final block is mixed in simple way,
- finalizer is taken from MurmurHash3 (it makes most of magic :) )
(64bit finalizer is taken from
http://zimbry.blogspot.ru/2011/09/better-bit-mixing-improving-on.html)
Also remove ST_USE_FNV1: it lacks implementation of many functions,
and looks to be abandoned
Author: Sokolov Yura aka funny_falcon funny.falcon@gmail.com