Feature #14955

[PATCH] gc.c: use MADV_FREE to release most of the heap page body

Added by normalperson (Eric Wong) 3 months ago. Updated 2 months ago.

Status: Open
Priority: Normal
Assignee: -
Target version: -
[ruby-core:88242]

Description

gc.c: use MADV_FREE to release most of the heap page body

On x86 and x86-64 Linux and FreeBSD (at least), we can release
most of the heap page body (12k of nearly 16k). This is better
than causing malloc fragmentation with free(3) on memaligned areas.

cf. https://sourceware.org/bugzilla/show_bug.cgi?id=14581

Note: memory is memory to madvise(2), regardless of whether it
came from brk(2) or mmap(2); so we expect to be able to madvise
any anonymous segments as long as they're page-aligned.
Allocators configured to use file-backed mappings will cause
warnings when $VERBOSE is set.
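
As a rough illustration of the idea (not the patch itself), the sketch below rounds an arbitrary range inward to OS page boundaries and hands the whole pages to madvise(2). The helper names `advise_reclaimable` and `release_whole_pages` are hypothetical, and the fallback to MADV_DONTNEED when MADV_FREE is missing or rejected is an assumption of this sketch rather than something stated above.

```c
#define _DEFAULT_SOURCE /* exposes madvise(2) and MADV_FREE on glibc */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: prefer MADV_FREE, fall back to MADV_DONTNEED. */
static int
advise_reclaimable(void *addr, size_t len)
{
#ifdef MADV_FREE
    if (madvise(addr, len, MADV_FREE) == 0)
        return 0;
    /* MADV_FREE can be rejected (e.g. by older kernels); try the fallback */
#endif
    return madvise(addr, len, MADV_DONTNEED);
}

/*
 * Hypothetical helper: advise the kernel that every OS page lying
 * entirely inside [start, start + len) may be reclaimed.
 * Returns the number of bytes advised, or 0 if nothing was released.
 * The contents of the advised pages must no longer be needed.
 */
static size_t
release_whole_pages(void *start, size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    uintptr_t page, beg, end;

    assert(start != NULL);
    if (ps <= 0)
        return 0;
    page = (uintptr_t)ps; /* assumed to be a power of two */
    beg = ((uintptr_t)start + page - 1) & ~(page - 1); /* round up */
    end = ((uintptr_t)start + len) & ~(page - 1);      /* round down */
    if (end <= beg)
        return 0; /* no whole page inside the range */
    if (advise_reclaimable((void *)beg, (size_t)(end - beg)) != 0)
        return 0;
    return (size_t)(end - beg);
}
```

With 4k OS pages and a 16k-aligned body that is slightly shorter than 16k, the rounded range covers three whole pages, which matches the "12k of nearly 16k" figure. Any data still stored in those pages is lost, so a real implementation has to keep whatever it still needs outside the advised range.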

History

#1 Updated by ko1 (Koichi Sasada) 3 months ago

It causes a system call and extra overhead, so I'm not sure it is acceptable.

With a previous proposal I measured some performance degradation from fine-grained madvise.
https://bugs.ruby-lang.org/issues/12236

I assume that we need to make sequential groups. A 16KB (4-page) group may be enough (but I'm not sure).

#2 [ruby-core:88253] Updated by normalperson (Eric Wong) 3 months ago

ko1@atdot.net wrote:

It causes a system call and extra overhead, so I'm not sure
it is acceptable.
With a previous proposal I measured some performance degradation
from fine-grained madvise.

OK. I am worried about that, too.

For this, I think we can also track age of page in tomb before
deciding to free or madvise. That can also reduce fragmentation
from the "aligned_free && memalign soon-after" case.
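
One way to picture the age-tracking idea is a per-page counter that is bumped on each GC cycle a page spends in the tomb, with release (free or madvise) happening only once the counter reaches a threshold. This is a minimal sketch under that assumption; `struct tomb_page`, `tomb_page_tick`, `tomb_page_resurrect`, and the `min_age` parameter are all hypothetical names and do not come from gc.c.

```c
#include <assert.h>
#include <stddef.h>

enum tomb_action { TOMB_KEEP, TOMB_RELEASE };

/* Hypothetical bookkeeping for one page sitting in the tomb. */
struct tomb_page {
    void *body;          /* the heap page body this entry refers to */
    unsigned int gc_age; /* GC cycles survived while in the tomb */
};

/*
 * Call once per GC cycle for each tomb page.
 * A page is kept for min_age cycles and reported for release on the
 * next cycle after that, so a page that is reused quickly never pays
 * for a free/memalign or madvise round trip.
 */
static enum tomb_action
tomb_page_tick(struct tomb_page *page, unsigned int min_age)
{
    assert(page != NULL);
    if (page->gc_age < min_age) {
        page->gc_age++;
        return TOMB_KEEP;
    }
    return TOMB_RELEASE;
}

/* Reset the age when a page leaves the tomb and is put back in use. */
static void
tomb_page_resurrect(struct tomb_page *page)
{
    assert(page != NULL);
    page->gc_age = 0;
}
```

The threshold trades memory held in the tomb against the number of system calls: a larger `min_age` means fewer releases but more idle memory.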

https://bugs.ruby-lang.org/issues/12236

I assume that we need to make sequential groups. A 16KB (4-page) group may be enough (but I'm not sure).

Yes, 16KB is tiny; but it also helps reduce pause time on lazy-sweep.
Tradeoffs...

#3 [ruby-core:88342] Updated by ko1 (Koichi Sasada) 3 months ago

off-topic :p:

If we had a lightweight communication channel between the user process and the OS, we could use this technique more aggressively.

https://www.azul.com/files/c4_paper_acm2.pdf proposes adding a remap table shared between the OS and the user process to achieve fine-grained page control.

(but they chose not to modify Linux after that)

For example, it would be best if the user process managed a list of "free-able pages" in user space (registering this page map with a single system call) and the OS could check it when system memory (OS-managed memory) is tight. Of course, this simple model has race issues and some other issues...

#4 [ruby-core:88448] Updated by normalperson (Eric Wong) 2 months ago

https://bugs.ruby-lang.org/issues/14955

One major question to ask is: does object count during
application lifetime vary enough to justify freeing
"struct heap_page_body"?

In my experience, object count is relatively stable
once an application reaches steady state, and we don't
benefit from freeing heap_page_body.

But then, maybe some weird applications allocate many objects at
startup, and steady state object count is much smaller than
startup.

Anyways I made changes in mwrap 2.1.0 to hopefully answer that
question by tracking heap_page_body lifetimes and deathtimes:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/439479
https://80x24.org/mwrap-public/20180811043106.GA9571@dcvr/

https://80x24.org/mwrap/Mwrap/HeapPageBody.html

#5 [ruby-core:88468] Updated by ko1 (Koichi Sasada) 2 months ago

On 2018/08/11 13:48, Eric Wong wrote:

https://bugs.ruby-lang.org/issues/14955

One major question to ask is: does object count during
application lifetime vary enough to justify freeing
"struct heap_page_body"?

In my experience, object count is relatively stable
once an application reaches steady state, and we don't
benefit from freeing heap_page_body.

Good question. Applications have their own memory usage
characteristics, and each has a suitable memory management strategy.

But then, maybe some weird applications allocate many objects at
startup, and steady state object count is much smaller than
startup.

Or special cases (like requests to an admin page) can increase the object count.

--
// SASADA Koichi at atdot dot net
