Feature #12236
closedIntroduce `mmap` managed heap
Description
Abstract¶
I propose mmap
(or similar API on Windows) managed heap.
Now, each heap pages are allocated by posix_memalign(), 16KB for each pages.
I wrote a patch to manage heap pages allocated by bulk mmap
(256MB),
and provide pages by 4KB each pages.
This technique can reduce several overhead,
and can reduce fragmentation problem by reducing page size.
- https://github.com/ruby/ruby/compare/trunk...ko1:heap_arena
- https://github.com/ruby/ruby/compare/trunk...ko1:heap_arena.patch
- https://github.com/ruby/ruby/compare/trunk...ko1:heap_arena.diff
Terminology¶
- Slot: Object representation. Each object consumes a slot. Now, a slot consumes 5 words.
- Page: A set of slots. Now, page size is 16KB (so that 400 slots available in a page on 64bit CPU)
- Heap: A set of pages. Now, there is only one heap.
Background¶
Fragmentation¶
Before Ruby 1.9, the page size was very big (and only few pages).
It was easy to manage, but we couldn't free pages because of fragmentation problem.
To relax this problem, from Ruby 1.9, we split pages into small fixed size (16KB).
The possibility to release pages are increased,
but there are several overhead to manage huge number of pages.
We introduced sorted list to manage such many pages.
This list is used by conservative marking
(because we need to know that a pointer points a slot or not).
Alignment¶
Ruby 2.0.0 introduced a bitmap marking technique, to improve CoW friendliness.
To implement this technique, we need to align each pages to access corresponding bitmaps.
So we have used posix_memalign(). This is why
Proposal¶
We propose mmap
managed heap.
Allocate a big area (call it arena) and provide pages from an arena.
Only mmap
, we only reserve virtual memory, not acquire physical memory.
We can use madvise
with MADV_DONTNEED
or MADV_FREE
to return a page to OS.
Redefined words:
- New page: A set of slots. Now, page size is 4KB (so that 100 slots available in a page on 64bit CPU)
- Arena: a set of pages. Now, arena size is 256MB (so that 65536 pages in an arena)
- Heap: a set of arenas.
Discussion¶
- Advantages:
- We don't need to manage sorted_list.
- We can manage small size pages to improve empty possibility.
- Easy to search bitmap bits because we can manage bitmaps per arenas.
- Disadvantages
- We need to manage x4 more pages. We need to avoid O(n) (n: page number) operations from the GC process.
- Difficult to port on a system which doesn't have
mmap
(or similar API) - We consume huge virtual memory (at least 256MB). But anybody worries about that?
Optimizations¶
To avoid O(n) operations, we introduced a generational technique for sweeping.
So that we can say "generational sweeping".
The technique is simple. If a page has only old generational slots,
then we can skip sweeping on such page.
We separate pages into "collectible" pages and "uncollectible" pages,
and sweep only "collectible" pages.
Evaluation¶
Actually, we can't observe impressive result on micro-benchmarks and my small rails application.
However, we believe that this technique will help future optimizations (such as separated heaps by types or lifetime, and so on).
Any comments?