Feature #9638

closed

[PATCH] limit IDs to 32-bits on 64-bit systems

Added by normalperson (Eric Wong) about 10 years ago. Updated almost 10 years ago.

Status:
Rejected
Assignee:
-
Target version:
[ruby-core:61496]

Description

This should allow better use of cache-friendly lookup mechanisms such as
funny_falcon's sparse array in [ruby-core:55079]

Also limits symbol space to prevent OOM.

Some structs may also be made smaller as a result (rb_method_entry_t).

We're changing ABI for 2.2.0 anyways, so this is a good time to introduce
this change.


Files

0001-ID-is-always-uint32_t.patch (3.62 KB) normalperson (Eric Wong), 03/14/2014 07:06 PM

Related issues 1 (0 open, 1 closed)

Related to Ruby master - Feature #11420: Introduce ID key table into MRI (Closed, ko1 (Koichi Sasada))

Updated by normalperson (Eric Wong) about 10 years ago

sparse array is described in ruby-core:55079

Updated by normalperson (Eric Wong) almost 10 years ago

I'm not sure if this is possible anymore due to Symbol GC.
No big deal, though.

Updated by ngoto (Naohisa Goto) almost 10 years ago

I'm using machines that have 2TB or more of main memory. I think these machines can handle more than 2**32 symbols, and I want to use the full 64-bit capacity.

Updated by normalperson (Eric Wong) almost 10 years ago

I am OK with closing this issue (but I'm not sure if I have permissions
to close on redmine).

However, do your applications need more than 2**32 different symbols?
That scares me :*(
How much memory do your Ruby processes use?

The Symbol table currently takes at least (48 + 48 + 40 = 136) bytes per
symbol on 64-bit, so 136 * (2 ** 32) bytes is 544 GiB just for the
symbol table (w/fstrings) in your app. That does not even account for
memory of symbols with string representations longer than 23 bytes,
nor the memory for hash table buckets.
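
As a quick check of the arithmetic above, here is a minimal C sketch. The 136-byte figure is the per-symbol overhead quoted in this post; the function names are hypothetical and not part of MRI, and the result is expressed in units of 2**30 bytes.

```c
#include <stdint.h>

/* Per-symbol overhead quoted in this thread for 64-bit: 48 + 48 + 40 bytes. */
enum { BYTES_PER_SYMBOL = 48 + 48 + 40 };

/* Hypothetical helper: bytes needed for `count` symbols at that overhead. */
uint64_t symbol_table_bytes(uint64_t count)
{
    return (uint64_t)BYTES_PER_SYMBOL * count;
}

/* Hypothetical helper: convert bytes to whole units of 2**30 bytes,
 * rounding down. */
uint64_t bytes_to_gib(uint64_t bytes)
{
    return bytes >> 30;
}
```

Calling `bytes_to_gib(symbol_table_bytes(UINT64_C(1) << 32))` gives the figure for a fully populated 32-bit ID space, before string bodies longer than 23 bytes and hash table buckets are counted.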

I need to know because I am also looking into using khash[1] for the
symbol table. By default, khash internal buckets/counters are all
32-bits. We can tweak khash to use 64-bit counters if needed,
but 2**32 symbols really should be enough.

The symbol table with khash might reduce memory overhead to ~90 bytes
per-symbol on average, though...

[1] git clone https://github.com/attractivechaos/klib.git
mruby also uses khash for (all?) its hash table needs.

Updated by normalperson (Eric Wong) almost 10 years ago

  • Status changed from Open to Rejected

Updated by ko1 (Koichi Sasada) almost 10 years ago

(2014/03/15 4:07), wrote:

> Also limits symbol space to prevent OOM.

What is OOM?
Out of memory?

Symbol GC doesn't help?

--
// SASADA Koichi at atdot dot net

Updated by normalperson (Eric Wong) almost 10 years ago

SASADA Koichi wrote:

> (2014/03/15 4:07), wrote:
>
>> Also limits symbol space to prevent OOM.
>
> What is OOM?
> Out of memory?

Yes, out-of-memory.

> Symbol GC doesn't help?

It does; but OOM was a secondary concern of mine.

I mainly wanted 32-bit ID so it might be easier to pack some structs
on 64-bit machines. 64-bit ID is not a big issue, though.
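
To illustrate the packing point, here is a rough C sketch. The struct layouts and names are hypothetical stand-ins, not MRI's actual rb_method_entry_t; a 64-bit ID is modelled as uint64_t and the proposed one as uint32_t. It only shows that a narrower ID can share an 8-byte slot with another 32-bit field instead of forcing padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two ID widths; not MRI's definitions. */
typedef uint64_t wide_id_t;   /* models a pointer-sized ID on 64-bit */
typedef uint32_t narrow_id_t; /* models the proposed 32-bit ID */

/* Hypothetical entry: an ID stored next to a 32-bit flags word. */
struct entry_wide {
    wide_id_t id;
    uint32_t flags; /* typically followed by 4 bytes of tail padding */
};

struct entry_narrow {
    narrow_id_t id;
    uint32_t flags; /* shares one 8-byte slot with the ID */
};

/* Report the sizes so the two layouts can be compared at run time. */
size_t entry_wide_size(void) { return sizeof(struct entry_wide); }
size_t entry_narrow_size(void) { return sizeof(struct entry_narrow); }
```

On common 64-bit ABIs the wide layout is usually padded out to 16 bytes while the narrow one fits in 8, but the exact sizes depend on the compiler and platform.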

Updated by ngoto (Naohisa Goto) over 8 years ago
