Bug #16121

Stop making a redundant hash copy in Hash#dup

Added by dylants (Dylan Thacker-Smith) over 2 years ago. Updated about 2 years ago.

Status: Closed
Priority: Normal
Target version: -
ruby -v: ruby 2.7.0dev (2019-08-23T16:41:09Z master b38ab0a3a9) [x86_64-darwin18]
[ruby-core:94507]

Description

Problem

While profiling object allocations, I noticed that Hash#dup allocates 2 objects instead of the expected 1. For comparison, Hash[hash] creates a copy with only a single object allocation and seemed to be more than twice as fast. Reading the source code showed why: Hash#dup first creates a copy of the Hash, then rehashes that copy. Since rehashing itself works by making a copy of the hash, the first copy made before rehashing is unnecessary.
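A minimal sketch of how the allocation counts could be checked, assuming `GC.stat(:total_allocated_objects)` is an adequate counter for this purpose. `allocated_objects` is a hypothetical helper written for illustration and is not part of the attached patches. The numbers it prints depend on the Ruby version and build, so no expected output is given.

```ruby
# Hypothetical helper: returns how many objects were allocated
# while the given block ran, based on the VM-wide allocation counter.
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

hash = { a: 1, b: 2 }

# Compare the two ways of copying a Hash mentioned in the report.
dup_allocs     = allocated_objects { hash.dup }
bracket_allocs = allocated_objects { Hash[hash] }

puts "Hash#dup allocations:   #{dup_allocs}"
puts "Hash[hash] allocations: #{bracket_allocs}"
```

The counter is global to the process, so the measurement is only meaningful when nothing else allocates between the two reads, for example in a single-threaded script like this one.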

Solution

I changed the code to make the copy through rehashing alone, which improves performance while preserving the existing behaviour.
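For readers unfamiliar with rehashing, the following sketch shows what it does at the Ruby level using the public Hash#rehash method. It only illustrates the concept and is not the C code touched by the patch. The variable names are chosen for illustration.

```ruby
key = [1]
h = { key => :v }

# Mutating the key changes key.hash, so the entry stored in the
# hash is now indexed under a stale hash value.
key << 2
stale = h[key]   # the lookup usually misses at this point

# Hash#rehash rebuilds the index from the current hash values of the keys.
h.rehash
fresh = h[key]

puts "before rehash: #{stale.inspect}"
puts "after rehash:  #{fresh.inspect}"
```

Whether Hash#dup itself rehashes the copy depends on the Ruby version, so this sketch calls Hash#rehash explicitly instead of relying on dup.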

Benchmark

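require 'benchmark'

# Number of times each block is run per report
N = 100000

# Registers a benchmark entry that runs the given block N times
def report(x, name)
  x.report(name) do
    N.times do
      yield
    end
  end
end

# One single-entry hash and one 20-entry hash with keys :a through :t
hashes = {
  small_hash: { a: 1 },
  larger_hash: 20.times.map { |i| [('a'.ord + i).chr.to_sym, i] }.to_h
}

# bmbm runs a rehearsal pass before the measured pass
Benchmark.bmbm do |x|
  hashes.each do |name, hash|
    report(x, "#{name}.dup") do
      hash.dup
    end
  end
end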

Results on master

                      user     system      total        real
small_hash.dup    0.401350   0.001638   0.402988 (  0.404608)
larger_hash.dup   7.218548   0.433616   7.652164 (  7.695990)

Results with the attached patch

                      user     system      total        real
small_hash.dup    0.336733   0.002425   0.339158 (  0.341760)
larger_hash.dup   6.617343   0.398407   7.015750 (  7.070282)

Files

0001-Remove-redundant-Check_Type-after-to_hash.diff.txt (624 Bytes): [PATCH 1/4] Remove redundant Check_Type after to_hash, dylants (Dylan Thacker-Smith), 08/23/2019 07:55 PM
0002-Fix-freeing-and-clearing-destination-hash-in-Hash.diff.txt (1.57 KB): [PATCH 2/4] Fix freeing and clearing destination hash in Hash#initialize_copy, dylants (Dylan Thacker-Smith), 08/23/2019 07:55 PM
0003-Remove-dead-code-paths-in-rb_hash_initialize_copy.diff.txt (1.12 KB): [PATCH 3/4] Remove dead code paths in rb_hash_initialize_copy, dylants (Dylan Thacker-Smith), 08/23/2019 07:55 PM
0004-Stop-making-a-redundant-hash-copy-in-Hash-dup.diff.txt (1.35 KB): [PATCH 4/4] Stop making a redundant hash copy in Hash#dup, dylants (Dylan Thacker-Smith), 08/23/2019 07:55 PM