Feature #10096

[PATCH] use khash for fstring and id_str tables

Added by Eric Wong 9 months ago. Updated 8 months ago.

[ruby-core:64041]
Status:Assigned
Priority:Normal
Assignee:Koichi Sasada

Description

frozen_strings and global_symbols.id_str hashes are two of the bigger
hashes in Ruby. They are needlessly ordered and incurs malloc overhead
in every st_table_entry.

Use an unordered open-addressing table which incurs no additional malloc
overhead besides the (larger) table itself.

Reduces "ruby -e exit" (w/RubyGems) by ~200K on eglibc malloc on
amd64 Debian stable due to having fewer allocations

global_symbols.str_id is left unchanged (for now) because it is used for
Symbol.all_symbols where ordering is expected

This introduces no user-visible changes or incompatibility
(unless I added a bug :x).

I chose khash because it is flexible and has (IMHO) a good API.
The API should also be familiar to mruby hackers, as mruby uses
a version of khash.

Future changes:

  • covert smaller internal hashes where ordering is not exposed to Ruby users
  • (hopefully) other hashes (methods/constants/ivars) hashes [Feature #9614]

(Note: tried a few times in lynx with 503 errors, never had this problem before
in lynx, trying clunky browser now)

khash.patch Magnifier (35.5 KB) Eric Wong, 07/26/2014 07:52 AM

khash-fstring-v3.patch Magnifier (28.4 KB) Eric Wong, 09/04/2014 09:28 PM

History

#1 Updated by Eric Wong 9 months ago

normalperson@yhbt.net wrote:

Feature #10096: [PATCH] use khash for fstring and id_str tables
https://bugs.ruby-lang.org/issues/10096

Updated for latest symbol.c changes in r46972

http://80x24.org/khash_v2.patch

P.S.: hsbt (or anybody else): Redmine gave me a 503 error when
attempting to upload the above patch (just a few minutes ago).
I also had problems last week, but uploading patches seems OK
on other tickets using lynx.

Client IP is 64.71.152.64 (dcvr.yhbt.net / 80x24.org server)

#2 Updated by Koichi Sasada 9 months ago

Eric Wong wrote:

frozen_strings and global_symbols.id_str hashes are two of the bigger
hashes in Ruby. They are needlessly ordered and incurs malloc overhead
in every st_table_entry.

We will change global_symbols.id_str (simple array) soon.
So do not touch it.

ID is now includes serial number. So use it.
But there are several issues to apply this simple approach.
For example, some IDs share same serial number.
But we (Nobu and I) discussed to solve them, and made up solutions.

Use an unordered open-addressing table which incurs no additional malloc
overhead besides the (larger) table itself.

Reduces "ruby -e exit" (w/RubyGems) by ~200K on eglibc malloc on
amd64 Debian stable due to having fewer allocations

Please write "from which size".
100MB -> (100MB-200KB) is trivial. 300KB->100KB is great work.

I chose khash because it is flexible and has (IMHO) a good API.
The API should also be familiar to mruby hackers, as mruby uses
a version of khash.

Did you check performance?
I know mruby uses it (and maybe more products).
So I assume it is sophisticated. How do you feel?

#3 Updated by Eric Wong 9 months ago

ko1@atdot.net wrote:

Eric Wong wrote:

frozen_strings and global_symbols.id_str hashes are two of the bigger
hashes in Ruby. They are needlessly ordered and incurs malloc overhead
in every st_table_entry.

We will change global_symbols.id_str (simple array) soon.
So do not touch it.

ID is now includes serial number. So use it.
But there are several issues to apply this simple approach.
For example, some IDs share same serial number.
But we (Nobu and I) discussed to solve them, and made up solutions.

OK, I look forward to your changes.

Use an unordered open-addressing table which incurs no additional malloc
overhead besides the (larger) table itself.

Reduces "ruby -e exit" (w/RubyGems) by ~200K on eglibc malloc on
amd64 Debian stable due to having fewer allocations

Please write "from which size".
100MB -> (100MB-200KB) is trivial. 300KB->100KB is great work.

The results go from around ~9300KB -> ~9100KB or so. I think
the variation is due to hash randomization, but there may be
other factors.

It used to be in the ~7300KB range a few months ago but I think GC
changes increased usage of initial startup...

I chose khash because it is flexible and has (IMHO) a good API.
The API should also be familiar to mruby hackers, as mruby uses
a version of khash.

Did you check performance?
I know mruby uses it (and maybe more products).
So I assume it is sophisticated. How do you feel?

Our benchmark suite doesn't reveal much:
http://80x24.org/bmlog-20140803-231204.7455

Adding an unrealistic micro-bench shows st is faster:
http://80x24.org/fstring_iter_st.patch
http://80x24.org/fstring_iter_khash.patch

$ FSTRING_ITER=1000000 /usr/bin/time ./ruby.st -e exit
1.30user 0.00system 0:01.30elapsed 99%CPU (0avgtext+0avgdata 9240maxresident)k
0inputs+0outputs (0major+1348minor)pagefaults 0swaps

$ FSTRING_ITER=1000000 /usr/bin/time ./ruby.khash -e exit
1.60user 0.00system 0:01.61elapsed 99%CPU (0avgtext+0avgdata 9108maxresident)k
0inputs+0outputs (0major+1306minor)pagefaults 0swaps

Memory usage is from fstring change only, id_str untouched.

I don't think the performance is noticeable in real world, and the
memory/pagefault reduction is not big, either.

Your call :)

#4 Updated by Hiroshi SHIBATA 9 months ago

  • Status changed from Open to Assigned

#5 Updated by Koichi Sasada 8 months ago

Sorry for delay.

I read your patch, and I'm not sure it is good idea to make klib directory.
This is only one thing I'm afraid.

#6 Updated by Eric Wong 8 months ago

updated patch:
* khash.h moved to top-level, klib/ removed
* updated for recent fstring-related changes in string.c

#7 Updated by Eric Wong 8 months ago

normalperson@yhbt.net wrote:

File khash-fstring-v3.patch added

OK to commit before 9/14? I think khash is the best for fstring
as we only store one VALUE (no extra allocation)

  • covert smaller internal hashes where ordering is not exposed to Ruby users
  • (hopefully) other hashes (methods/constants/ivars) hashes [Feature #9614]

Maybe ihash is better for method entries and constant entries because
they require a struct allocation.

I'm refreshing that series for methods + constants with ordering as a
split-out patch. We may keep ordering (at memory cost) since ccan/list
makes it easy and we still save memory from avoiding st_table_entry
allocations.

I'm not going to change ivars for now.

Also available in: Atom PDF