Feature #20470

Extract Ruby's Garbage Collector

Added by peterzhu2118 (Peter Zhu) 15 days ago. Updated 1 day ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:117765]

Description

Background

As described in [Feature #20351], we are working on the ability to plug alternative garbage collector implementations into Ruby. Our goal is to allow developers and researchers to create and experiment with new implementations of garbage collectors in Ruby in a simplified way. This will also allow experimentation with different GC implementations in production systems so users can choose the best GC implementation for their workloads.

Implementation

GitHub PR: 10721

In this patch, we have split the current gc.c file into two files: gc.c and gc_impl.c.

gc.c now contains only code that is not specific to Ruby's GC. This includes code to mark objects (which the GC implementation may choose not to use) and wrappers for internal APIs that the implementation may need to use (e.g. locking the VM).

gc_impl.c now contains the implementation of Ruby's GC. This includes marking, sweeping, compaction, and statistics. Most importantly, gc_impl.c only uses public APIs in Ruby and a limited set of functions exposed in gc.c. This allows us to build gc_impl.c independently of Ruby and plug Ruby's GC into itself.

Demonstration

After checking out the branch, we can first configure with --with-shared-gc:

$ ./configure --with-shared-gc
...
$ make -j
...

Let's now change the slot size of the GC to 64 bytes:

$ sed -i 's/\(#define BASE_SLOT_SIZE\).*/\1 64/' gc_impl.c

We can compile gc_impl.c independently using the following commands for clang or gcc (you may have to change the last -I to match your architecture and platform):

$ clang -Iinclude -I. -I.ext/include/arm64-darwin23 -undefined dynamic_lookup -g -O3 -dynamiclib -o libgc.dylib gc_impl.c
$ gcc -Iinclude -I. -I.ext/include/x86_64-linux -Wl,-undefined,dynamic_lookup -fPIC -g -O3 -shared -o libgc.so gc_impl.c

We can see that by default, the slot size is 40 bytes and objects are 40 bytes in size:

$ ./ruby -e "puts GC.stat_heap(0, :slot_size)"
40
$ ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1054a23f0", "type":"OBJECT", "shape_id":3, "slot_size":40, "class":"0x10528fd38", "embedded":true, "ivars":0, "memsize":40, "flags":{"wb_protected":true}}

We can now load our new GC using the RUBY_GC_LIBRARY_PATH environment variable (note that you may have to change the path to the DSO):

$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -e "puts GC.stat_heap(0, :slot_size)"
64
$ RUBY_GC_LIBRARY_PATH=./libgc.dylib ./ruby -robjspace -e "puts ObjectSpace.dump(Object.new)"
{"address":"0x1038de440", "type":"OBJECT", "shape_id":3, "slot_size":64, "class":"0x10355fc00", "embedded":true, "ivars":0, "memsize":64, "flags":{"wb_protected":true}}
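
To compare every heap rather than only heap 0, a small script along these lines could be used. This is only a sketch: the helper name heap_slot_sizes and the file name slot_sizes.rb are made up for illustration, and it assumes a Ruby where GC.stat_heap is available and reports :slot_size for each heap.

```ruby
# Collect the slot size of every heap reported by the running GC.
# GC.stat_heap without arguments returns { heap_index => { slot_size: ..., ... } }.
# heap_slot_sizes is a hypothetical helper, not part of Ruby or of the patch.
def heap_slot_sizes
  GC.stat_heap.transform_values { |stats| stats[:slot_size] }
end

heap_slot_sizes.each do |heap, size|
  puts "heap #{heap}: #{size} bytes per slot"
end
```

Saved as slot_sizes.rb, it can be run once as ./ruby slot_sizes.rb and once with RUBY_GC_LIBRARY_PATH set as above. The line for heap 0 should match the GC.stat_heap(0, :slot_size) values shown in each case.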

Benchmark

Benchmarks were run on commit c78cebb on Ubuntu 22.04 using yjit-bench on commit cc5a76e.

With the gc_impl branch compiled without --with-shared-gc (i.e. how Ruby is built by default), the benchmarks show little to no decrease in performance, with most of them 0% to 1% slower:

--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    73.9         0.3         74.6          0.3         1.00             0.99          
chunky-png      911.1        0.2         937.2         0.2         0.97             0.97          
erubi-rails     1582.4       0.1         1583.5        0.0         1.00             1.00          
hexapdf         2716.2       1.1         2760.2        0.7         1.00             0.98          
liquid-c        68.9         0.5         68.6          0.4         1.00             1.00          
liquid-compile  67.9         0.1         68.2          0.2         0.99             1.00          
liquid-render   172.8        0.1         174.9         0.1         0.99             0.99          
lobsters        1033.9       0.4         1036.0        0.3         1.08             1.00          
mail            135.1        0.2         136.5         0.2         0.99             0.99          
psych-load      2250.8       0.1         2274.9        0.3         0.99             0.99          
railsbench      2499.2       0.2         2502.9        0.1         1.00             1.00          
rubocop         178.3        0.5         179.8         0.4         1.00             0.99          
ruby-lsp        116.8        0.1         118.5         0.2         1.00             0.99          
sequel          75.4         0.2         76.2          0.3         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------

With the gc_impl branch compiled with --with-shared-gc and Ruby's current GC loaded using RUBY_GC_LIBRARY_PATH, the benchmarks are still fairly good, with a performance decrease of only around 1% to 2%:

--------------  -----------  ----------  ------------  ----------  ---------------  --------------
bench           master (ms)  stddev (%)  gc_impl (ms)  stddev (%)  gc_impl 1st itr  master/gc_impl
activerecord    74.2         0.2         75.4          0.5         0.98             0.98          
chunky-png      916.3        0.3         933.2         0.1         0.98             0.98          
erubi-rails     1597.6       0.1         1586.3        0.2         1.01             1.01          
hexapdf         2731.4       0.5         2776.8        0.7         1.00             0.98          
liquid-c        68.5         0.1         68.9          0.4         0.97             0.99          
liquid-compile  67.4         0.4         68.3          0.2         0.95             0.99          
liquid-render   171.8        0.1         175.6         0.2         0.97             0.98          
lobsters        1031.9       0.3         1041.4        0.3         0.94             0.99          
mail            135.5        0.4         136.7         0.1         0.99             0.99          
psych-load      2246.0       0.1         2281.3        0.1         0.99             0.98          
railsbench      2490.9       0.0         2490.0        0.1         1.01             1.00          
rubocop         179.8        2.3         180.0         0.4         0.94             1.00          
ruby-lsp        117.3        0.1         118.5         0.1         0.99             0.99          
sequel          75.8         0.5         76.3          0.2         0.99             0.99          
--------------  -----------  ----------  ------------  ----------  ---------------  --------------
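
As a reading aid for the last column, the sketch below recomputes master/gc_impl for three rows of the first table. It assumes the column is the master time divided by the gc_impl time, rounded to two decimals, so values below 1.00 mean gc_impl is slower. The helper names ratio and slowdown_percent are made up for illustration and are not part of yjit-bench.

```ruby
# [benchmark, master (ms), gc_impl (ms)], copied from the first table above.
ROWS = [
  ["activerecord", 73.9, 74.6],
  ["chunky-png", 911.1, 937.2],
  ["hexapdf", 2716.2, 2760.2],
].freeze

# Assumed definition of the master/gc_impl column: a ratio of times,
# so a value below 1.00 means gc_impl took longer than master.
def ratio(master_ms, gc_impl_ms)
  (master_ms / gc_impl_ms).round(2)
end

# The same comparison expressed as "gc_impl is N% slower than master".
def slowdown_percent(master_ms, gc_impl_ms)
  ((gc_impl_ms / master_ms - 1.0) * 100).round(1)
end

ROWS.each do |name, master_ms, gc_impl_ms|
  puts format("%-14s ratio=%.2f slowdown=%.1f%%",
              name,
              ratio(master_ms, gc_impl_ms),
              slowdown_percent(master_ms, gc_impl_ms))
end
```

Under that assumption the recomputed ratios for these rows agree with the table (0.99, 0.97, 0.98).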

Limitations

We recognize that our current implementation does not yet offer the flexibility required for a generic plug-in GC. Specifically, the set of APIs that the plug-in GC has to implement is relatively large, at around 70 functions. Additionally, some of these functions are specific to the current GC.

We would like to emphasize that the API is NOT stable and is subject to change. We will be working on improving this API and reducing the surface area. This will be future work and we're not working on it in this phase.

Future plans

  • Refactor and improve gc_impl.c.
  • Implement alternate GC implementations, such as the Epsilon GC and MMTk, to prove that this API allows for alternate implementations of the GC.
  • Reduce and improve the API of the GC implementation.
  • Benchmark and improve performance of the DSO API.

Updated by nobu (Nobuyoshi Nakada) 4 days ago

My concern is that a part of this feature, adding a new environment variable that loads a shared object implicitly, can cause security issues.
LD_PRELOAD, DYLD_INSERT_LIBRARIES and so on have the same risk, so dynamic linkers remove such variables in suid processes at least.
However, it would be difficult to make everyone who needs to know aware of a new variable.
So I'd be against allowing this in production.

Updated by peterzhu2118 (Peter Zhu) 4 days ago

I think the risk is fairly low, since the user has to compile Ruby with --with-shared-gc to enable this feature. Since they have to manually enable this feature at compile time, they should be aware of the possible security issues that come with it.

If we want to completely mitigate this risk, we could instead use a command line argument rather than an environment variable.

Updated by katei (Yuta Saito) 1 day ago · Edited

+1 for extracting the GC implementation from gc.c into a separate gc_impl.c file.

My motivation: Some use cases of Ruby on Wasm (e.g. edge computing platforms) do not actually need to collect garbage objects during execution because such processes do not last long. Splitting out the collector implementation allows us to have a lightweight implementation like the no-op GC in the JVM. It would be a good fit for such use cases to reduce runtime overhead and program size, and it would also be helpful for performance analysis.
