Feature #9638

[PATCH] limit IDs to 32-bits on 64-bit systems

Added by Eric Wong 12 months ago. Updated 9 months ago.

[ruby-core:61496]
Status: Rejected
Priority: Low
Assignee: -

Description

This should allow better use of cache-friendly lookup mechanisms such as
funny_falcon's sparse array in

Also limits symbol space to prevent OOM.

Some structs may also be made smaller as a result (rb_method_entry_t).

We're changing ABI for 2.2.0 anyways, so this is a good time to introduce
this change.

0001-ID-is-always-uint32_t.patch (3.62 KB) Eric Wong, 03/14/2014 07:06 PM

History

#1 Updated by Eric Wong 12 months ago

sparse array is described in ruby-core:55079

#2 Updated by Eric Wong 9 months ago

I'm not sure if this is possible anymore due to Symbol GC.
No big deal, though.

#3 Updated by Naohisa Goto 9 months ago

I'm using machines that have 2 TB or more of main memory. I think these machines can handle more than 2**32 symbols, and I want to use the full 64-bit capacity.

#4 Updated by Eric Wong 9 months ago

I am OK with closing this issue (but I'm not sure if I have permissions
to close on redmine).

However, do your applications really need more than 2**32 different symbols?
That scares me :*(
How much memory do your Ruby processes use?

The Symbol table currently takes at least (48 + 48 + 40 = 136) bytes per
symbol on 64-bit, so 136 * (2 ** 32) is 544 GiB just for the
symbol table (with fstrings) in your app. That does not even account for
the memory of symbols with string representations longer than 23 bytes,
nor the memory for hash table buckets.
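
The estimate above can be checked with a few lines of C. This is only a sketch of the arithmetic; the helper names (`symtab_bytes`, `symtab_gib`) are made up for illustration and do not exist in Ruby's source. It assumes the 136-byte per-symbol figure quoted above and counts a gigabyte as 2**30 bytes.

```c
#include <stdint.h>

/* Per-symbol overhead quoted above: 48 + 48 + 40 bytes on 64-bit. */
enum { BYTES_PER_SYMBOL = 48 + 48 + 40 };

/* Hypothetical helper: bytes needed for `nsyms` symbols at that overhead. */
static uint64_t symtab_bytes(uint64_t nsyms)
{
    return (uint64_t)BYTES_PER_SYMBOL * nsyms;
}

/* Hypothetical helper: the same figure in units of 2**30 bytes, rounded down. */
static uint64_t symtab_gib(uint64_t nsyms)
{
    return symtab_bytes(nsyms) >> 30;
}
```

Calling `symtab_gib(1ULL << 32)` gives 544, matching the figure in the text. Long symbol names and hash table buckets are not included, as noted above.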

I need to know because I am also looking into using khash[1] for the
symbol table. By default, khash internal buckets/counters are all
32-bits. We can tweak khash to use 64-bit counters if needed,
but 2**32 symbols really should be enough.

The symbol table with khash might reduce memory overhead to ~90 bytes
per-symbol on average, though...

[1] git clone https://github.com/attractivechaos/klib.git
mruby also uses khash for (all?) its hash table needs.

#5 Updated by Eric Wong 9 months ago

  • Status changed from Open to Rejected

#6 Updated by Koichi Sasada 9 months ago

(2014/03/15 4:07), normalperson@yhbt.net wrote:

> Also limits symbol space to prevent OOM.

What is OOM?
Out of memory?

Symbol GC doesn't help?

--
// SASADA Koichi at atdot dot net

#7 Updated by Eric Wong 9 months ago

SASADA Koichi <ko1@atdot.net> wrote:

> (2014/03/15 4:07), normalperson@yhbt.net wrote:
>
> > Also limits symbol space to prevent OOM.
>
> What is OOM?
> Out of memory?

Yes, out-of-memory.

> Symbol GC doesn't help?

It does; but OOM was a secondary concern of mine.

I mainly wanted 32-bit ID so it might be easier to pack some structs
on 64-bit machines. 64-bit ID is not a big issue, though.
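
To illustrate the struct-packing point, here is a minimal sketch in C. The two structs are hypothetical and are not Ruby's actual layouts (they are not `rb_method_entry_t`). They only show how a 32-bit ID can share an 8-byte word with another 32-bit field, while a 64-bit ID typically forces padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with a 64-bit ID. On typical LP64 ABIs the
 * struct is padded after `flags` so its size is a multiple of 8. */
struct entry_id64 {
    uint64_t id;
    uint32_t flags;
};

/* Hypothetical layout with a 32-bit ID: `id` and `flags` fit in
 * one 8-byte word with no padding. */
struct entry_id32 {
    uint32_t id;
    uint32_t flags;
};

/* Bytes saved per entry by narrowing the ID (platform dependent). */
static size_t bytes_saved_per_entry(void)
{
    return sizeof(struct entry_id64) - sizeof(struct entry_id32);
}
```

On a typical x86-64 build the wide struct is 16 bytes and the narrow one is 8, but the exact sizes depend on the ABI. The gain only appears when another small field sits next to the ID.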
