Bug #13343

Improve Hash#merge performance

Added by watson1978 (Shizuo Fujita) over 3 years ago. Updated about 3 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
[ruby-dev:50026]

Description

With this patch, Hash#merge becomes about 60% faster.

Before

                 user     system      total        real
Hash#merge   0.160000   0.020000   0.180000 (  0.182357)

After

                 user     system      total        real
Hash#merge   0.110000   0.010000   0.120000 (  0.114404)
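
To make the "around 60%" figure concrete, the short calculation below derives it from the two "real" timings reported above. The variable names are illustrative only; the numbers are copied from the benchmark output. Note that the same data can be read two ways: as a throughput gain or as a reduction in elapsed time.

```ruby
# Wall-clock ("real") seconds taken from the Before/After output above.
before = 0.182357
after  = 0.114404

# Throughput ratio: how many times more merges fit in the same time.
speedup = before / after

# Fraction of the original elapsed time that was saved.
time_saved = (before - after) / before

puts format("speedup: %.3fx (%.1f%% faster)", speedup, (speedup - 1) * 100)
puts format("elapsed time reduced by %.1f%%", time_saved * 100)
```

Read as throughput, the gain is close to 60%; read as elapsed time, the reduction is closer to 37%.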

Test code

require 'benchmark'

Benchmark.bmbm do |x|
  hash1 = {}
  100.times { |i| hash1[i.to_s] = i }
  hash2 = {}
  100.times { |i| hash2[(i*2).to_s] = i*2 }

  x.report "Hash#merge" do
    10000.times do
      hash1.merge(hash2)
    end
  end
end

Patch

The patch is in https://github.com/ruby/ruby/pull/1533

Updated by normalperson (Eric Wong) over 3 years ago

watson1978@gmail.com wrote:

https://bugs.ruby-lang.org/issues/13343

Hash#merge will be faster around 60%.

+Cc ruby-core, since your post was English (and I don't read Japanese)

This is promising!

The patch is in https://github.com/ruby/ruby/pull/1533

We need to check for redefinition of initialize_dup and
initialize_copy methods in Hash for this to be correct.

Unfortunately for people optimizing Ruby, corner-case
redefinition checks are probably necessary :<
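
As a rough illustration of the concern, the sketch below defines a hypothetical Hash subclass (TracedHash is not part of Ruby or the patch) whose initialize_copy records its source. Object#dup always goes through that hook; whether Hash#merge does depends on how the interpreter duplicates the receiver, which is exactly what the patch changes, so the second result is printed rather than assumed.

```ruby
# Hypothetical subclass used only to observe the copy hook.
class TracedHash < Hash
  attr_reader :copied_from

  # Called by Object#dup and Object#clone on the new copy.
  def initialize_copy(source)
    super
    @copied_from = source.object_id
  end
end

original = TracedHash.new
original["a"] = 1

# dup dispatches to initialize_copy, so the hook is observable here.
copy = original.dup
hook_ran_on_dup = (copy.copied_from == original.object_id)

# merge may or may not dispatch to the hook, depending on the Ruby build.
merged = original.merge("b" => 2)
hook_ran_on_merge = merged.respond_to?(:copied_from) && !merged.copied_from.nil?

puts "hook ran on dup:   #{hook_ran_on_dup}"
puts "hook ran on merge: #{hook_ran_on_merge}"
```

If the two lines differ, user code that overrides the hook can tell a fast internal copy apart from a regular dup, which is the corner case a redefinition check would guard against.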

Also, I wonder if we can improve rb_funcall to better support
inline caching. rb_funcall API is also bad since it cannot use
inline cache for method lookup. Maybe a better C API can be
introduced for faster function calls from C.

Note: I checked commit c5d74afdb4cfea2a4c9ff432d9da82f0649a1e67
by having a "fetch = +refs/pull/*:refs/remotes/ruby/pull/*"
line in a "remote" section of my .git/config. I did not
use any proprietary API or JavaScript to view your changes.

Updated by watson1978 (Shizuo Fujita) over 3 years ago

I followed the behavior of Array's methods such as

VALUE
rb_ary_sort(VALUE ary)
{
    ary = rb_ary_dup(ary);   /* internal copy, no method dispatch */
    rb_ary_sort_bang(ary);
    return ary;
}
It does not check whether initialize_dup/initialize_copy were overridden.
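
The Array precedent can be probed from Ruby with a small experiment. CountingArray below is a hypothetical subclass written for this illustration; it counts how often initialize_copy is dispatched. The count after dup is certain, while the count after sort is only printed, since it depends on the interpreter's internals.

```ruby
# Hypothetical subclass that counts copy-hook invocations.
class CountingArray < Array
  @copies = 0
  class << self
    attr_accessor :copies
  end

  def initialize_copy(source)
    self.class.copies += 1
    super
  end
end

list = CountingArray.new([3, 1, 2])

# Object#dup dispatches to initialize_copy exactly once.
list.dup
copies_after_dup = CountingArray.copies

# Array#sort copies the receiver internally before sorting.
sorted = list.sort
copies_after_sort = CountingArray.copies

puts "copies after dup:  #{copies_after_dup}"
puts "copies after sort: #{copies_after_sort}"
```

If the counter does not increase after sort, the copy was made without dispatching to the overridden hook, matching the behavior described above.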

Updated by watson1978 (Shizuo Fujita) about 3 years ago

  • Status changed from Open to Closed

Applied in changeset trunk|r58811.


Improve Hash#merge performance

  • hash.c (rb_hash_merge): use rb_hash_dup() instead of rb_obj_dup() to duplicate
    the Hash object. rb_hash_dup() is a faster duplication function for Hash objects
    because it avoids calling the Hash#initialize_dup method.

    Hash#merge becomes about 60% faster.
    [ruby-dev:50026] [Bug #13343] [Fix GH-1533]

Before

                 user     system      total        real
Hash#merge   0.160000   0.020000   0.180000 (  0.182357)

After

                 user     system      total        real
Hash#merge   0.110000   0.010000   0.120000 (  0.114404)

Test code

require 'benchmark'

Benchmark.bmbm do |x|
  hash1 = {}
  100.times { |i| hash1[i.to_s] = i }
  hash2 = {}
  100.times { |i| hash2[(i*2).to_s] = i*2 }

  x.report "Hash#merge" do
    10000.times do
      hash1.merge(hash2)
    end
  end
end
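
For readers less familiar with hash.c, the changeset can be pictured at the Ruby level: merge duplicates the receiver and then updates the copy, and the patch only swaps which duplication routine the C code uses. The helper merge_via_dup below is a hypothetical Ruby stand-in for that structure, not the actual implementation.

```ruby
# Hypothetical Ruby-level model of "duplicate, then update".
def merge_via_dup(receiver, other)
  receiver.dup.update(other)
end

hash1 = { "a" => 1, "b" => 2 }
hash2 = { "b" => 20, "c" => 30 }

via_dup = merge_via_dup(hash1, hash2)
builtin = hash1.merge(hash2)

puts via_dup == builtin   # both approaches yield the same contents
puts hash1.inspect        # the receiver is left untouched
```

Because the result is the same either way, the cost of the duplication step is the only thing the patch changes, and that step is where the benchmark difference comes from.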
