Feature #9614
openordering of non-Hash items which use st_ internally
Description
Hi matz, I would like your permission to remove the order preservation from
any or all of the following currently implemented using st_table
:
- method tables
- global symbols (
Symbol.all_symbols
) - constant tables
- instance variable tables
-
global_variables
method Thread#keys
- anything besides the
Hash
class
I am currently working on a patch series to reduce internal memory usage,
so far I have only converted three pieces:
- method tables (~200K reduction)
- symbol table (
global_symbols.
{id_str
,sym_id
}) (~200K) -
frozen_strings
(~100K)
n.b. frozen_strings
ordering is never exposed to users, so I expect
it to be OK.
Memory reduction is just based on "ruby -e exit
" (which loads RubyGems);
bigger programs with more methods/symbols will save more memory.
Work-in-progress patches attached (0002 describes implementation details)
Files
Updated by normalperson (Eric Wong) over 10 years ago
normalperson@yhbt.net wrote:
Hi matz, I would like your permission to remove the order preservation from
any or all of the following currently implemented usingst_table
:
- method tables
- global symbols (
Symbol.all_symbols
)- constant tables
- instance variable tables
global_variables
methodThread#keys
- anything besides the
Hash
class
matz: ping?
Updated by ko1 (Koichi Sasada) over 10 years ago
Just a comment.
I think it should be another parameter of st_table.
Now all st_tables have order.
But most of case, it doesn't needed.
I think it is fine for me that st_table user can select.
Updated by nobu (Nobuyoshi Nakada) over 10 years ago
- Description updated (diff)
How is it the rate of reduction?
Since I wasn't very positive to make the ordering a spec, I'm not against it.
But it doesn't feel nice to have many similar mechanisms, too.
Updated by nobu (Nobuyoshi Nakada) over 10 years ago
I suspect that tests for local_variables
would need fixing too.
Updated by normalperson (Eric Wong) over 10 years ago
Hi, thank you for the comments!
ko1: adding flag for ordering might complicate the st code even more.
I think st should only be for implementing Hash.
My primary goals for ihash are to reduce pointer chasing and malloc use.
ihash uses container_of, like ccan/list:
st (unpacked entries):
st_table -> st_table->bins -> st_table_entry -> data structure
st (packed entries):
st_table -> st_table->packed.entries -> data structure
ihash:
rb_ihash_table -> data structure (via container_of)
Removing ordering requirment allows us to avoid writing/maintaining more
(new) code and also save two pointers (fore/prev).
nobu: reduction depends on what is stored
Best case so far are method table and symbol table:
For each method entry, we save:
sizeof(st_table_entry) - sizeof(rb_ihash_node) + malloc_overhead
On 64-bit with glibc malloc: 48 - 8 + 16 = 56 bytes saved.
AFAIK, jemalloc has less overhead for small allocations, but even
without malloc overhead, saving 40 bytes per method entry is great.
Symbol table is a big win, too, one single struct has membership
in both sym_id and id_str tables:
struct rb_idsym {
struct rb_ihash_node id_str_node; /* sizeof(void *) */
ID id;
struct rb_ihash_node sym_id_node;
VALUE symstr;
st_index_t hashval;
};
On 64-bit:
Before: 48 + 48 = 96 (+ 32 bytes on glibc malloc)
After: 40 (+ 16 bytes on glibc malloc)
If we add ordering to Symbol.all_symbols, we only need one pair
of list pointers and not two. But I prefer to not need any :)
Worst case is only saving two pointers (ivar table); but I may look at
funny-falcon's sparse array for the ivar table. I have not looked at
the local variable table, but it may be in the same class as ivar
tables and be better with a sparse array.
P.S.: I am also considering truncating rb_idsym.hashval to use as ID.
This will:
a) reduce per-symbol overhead by sizeof(st_index_t)
b) (hopefully) improve distribution for global method cach
Measuring the effect of b) may be hard, though.
Updated by ko1 (Koichi Sasada) over 10 years ago
ko1: adding flag for ordering might complicate the st code even more.
I think st should only be for implementing Hash.
I agree that st.c is already complicated.
But I think Hash class should use original one.
Adding similar data structure can also increase complexity.
At least, I like same interfaces for such similar data structures.
My primary goals for ihash are to reduce pointer chasing and malloc use.
ihash uses container_of, like ccan/list:
It makes sense.
I think open addressing is better for such internal, small tables such as method table.
It doesn't need chain buckets.
So I wanted to extend st to introduce open addressing mode.
Updated by normalperson (Eric Wong) over 10 years ago
ko1@atdot.net wrote:
Adding similar data structure can also increase complexity.
At least, I like same interfaces for such similar data structures.
Right, I try to make ihash API like st (insert/update/foreach/lookup).
However, there must be some differences due to data layout changes.
e.g. delete could be implemented without a callback, so I call the new
operation "unlink" instead. Maybe "delink" is a less ambiguous name...
I think open addressing is better for such internal, small tables such as method table.
It doesn't need chain buckets.So I wanted to extend st to introduce open addressing mode.
Open addressing will probably work better for instance/local var tables
with simple ID->(VALUE|long|ID) mapping (no struct behind the data).
For larger structures like rb_method_entry/rb_const_entry, I think using
container_of-based storage is best. Open addressing may not make sense
when using container_of, chaining was simpler and easier to match what
st already does, so I chose chaining for ihash.
Updated by normalperson (Eric Wong) over 10 years ago
I forget, ordering is easy to add to ihash with ccan/list.
Work-in-progress, this is only for method entries with ihash ordered
doing "ruby -e exit" (loading rubygems)
Numbers from valgrind, so "bytes allocated" does not take into account malloc
overhead of particular malloc implementations:
st (original):
total heap usage: 48,119 allocs, 19,169 frees, 8,106,443 bytes allocated
ihash-unordered
total heap usage: 45,571 allocs, 19,501 frees, 8,038,885 bytes allocated
ihash-ordered:
total heap usage: 45,571 allocs, 19,501 frees, 8,089,941 bytes allocated
The reduction in overall malloc/free calls is nice; but bytes allocated
is small because we're dealing with small elements. This makes method
entries much bigger (1 pointer for hash chaining, 2 pointers for
list_node), but reduces the need to make separate allocations for
st_table_entry.
-
http://bogomips.org/ruby.git/patch/?id=3930d8172dea41cd
ihash: initial implementation -
http://bogomips.org/ruby.git/patch/?id=02334f26a0c2a1f5
convert method entries to unordered ihash -
http://bogomips.org/ruby.git/patch/?id=fcf8e764bcd5a1c1
preserve ordering in method entries
git clone git://bogomips.org/ruby.git branch=ihash7
(I also started working on constants, but haven't added ordering, yet).
Running benchmark suite now, I expect uncached method entries for larger
classes/modules should be faster due to reduced indirection as described
in [ruby-core:62578]
For small (currently <=5) element tables, it is like st-packed and does a
linear search (which will cross cache lines, unfortunately)
Updated by hsbt (Hiroshi SHIBATA) 10 months ago
- Status changed from Open to Assigned