Project

General

Profile

Actions

Bug #7166

closed

Speed up Hash#dup by patching Hash#initialize_copy

Added by tenderlovemaking (Aaron Patterson) over 11 years ago. Updated over 11 years ago.

Status:
Closed
Target version:
-
ruby -v:
ruby 2.0.0dev (2012-10-15 trunk 37193) [x86_64-darwin12.2.0]
Backport:
[ruby-core:48009]

Description

Hash#dup can be sped up. Hash#initialize_copy will iterate over each pair in the hash, inserting in to the new hash. I think we can speed up hash duping by using st_copy and copying the underlying hash table.

Here is the benchmark I've been using:

https://gist.github.com/3893852

When you pass a hash to Hash.[], it just uses st_copy to copy the hash. If you run the benchmark, you'll see a fairly large difference between the performance of using Hash#dup and copying the hash via Hash.[].

I've attached a patch that changes Hash#initialize_copy to use st_copy. Here is a plot of the performance difference:

http://i.imgur.com/ai9Am.png

The blue line is the old Hash#dup, the green line is Hash#dup after my patch is applied, and the red line is copying via Hash.[].

I'm not 100% confident in this patch, so I hope someone can review more closely before applying (or rejecting!). Thanks.


Files

hashdup.diff (1.61 KB) hashdup.diff tenderlovemaking (Aaron Patterson), 10/16/2012 02:37 AM

Updated by Eregon (Benoit Daloze) over 11 years ago

The gain is not so big at me, but it's clearly there (I get 1.4 to 1.6 speed-up).
Note about half of the time is spent in GC.

Updated by matz (Yukihiro Matsumoto) over 11 years ago

  • Assignee set to tenderlovemaking (Aaron Patterson)

Go ahead. If we see the problem, it can be easily reverted.

Matz.

Updated by tenderlovemaking (Aaron Patterson) over 11 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

Thanks. I've committed this in r37232

Updated by usa (Usaku NAKAMURA) over 11 years ago

  • Status changed from Closed to Assigned

This introduce imcompatiblity.
A testcase of Sets fails.

I've added a new test to test/ruby/test_hash.rb to clarify the problem
at r37238.

Updated by matz (Yukihiro Matsumoto) over 11 years ago

Compatibility is important. Call rehash in initialize_copy and see how performance changes.

Matz.

Updated by tenderlovemaking (Aaron Patterson) over 11 years ago

I added the call to rehash and tested. It's still faster than using rb_hash_replace, so I'll add the fix.

Sorry I broke the tests!

Updated by tenderlovemaking (Aaron Patterson) over 11 years ago

  • Status changed from Assigned to Closed
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0