Feature #16984

Updated by alanwu (Alan Wu) almost 4 years ago

Currently, iclasses are "shady", or not protected by write barriers. Because of that, the GC needs to spend more time marking these objects than otherwise. Applications that make heavy use of modules should see a reduction in GC time as they have a significant number of live iclasses on the heap.

 - Put logic for iclass method table ownership into a function
 - Remove calls to WB_UNPROTECT and insert write barriers for iclasses

This is the second version of this change. It's much simpler and it doesn't introduce new garbage collected objects. I realized that despite saving a pointer to some other object's method table, iclasses don't mark the tables. So, for each method table, there is a unique object that's responsible for marking it. Since write barriers are only needed for the object that is marking the newly written value (correct me if I'm wrong here), having a unique object that marks the tables makes things straightforward.

The numbers from v1 of this patch were a bit inflated because we were [allocating an excessive amount of iclasses](https://github.com/ruby/ruby/commit/37e6c83609ac9d4c30ca4660ee16701e53cf82a3) so I measured again. An app that has an approximately 250 MiB heap saw a 22% reduction in minor GC time.

Code: https://github.com/ruby/ruby/pull/3238
Credits to @tenderlovemaking for motivating this change.

---

Consider the following code:

```ruby
module M
  def foo; end
  def bar; end
end

class C
  include M
end
```

The object reference graph from running the code looks like this:

```
+---+                +-----+
| M |----------------| foo |-+
+---+                +-----+ |
  |                  +-----+ |
  +------------------| bar | |
                     +-----+ |
+-----------+           |    |
| iclass(M) |-----------+    |
+-----------+----------------+
```

Applying the proposed patch, the graph becomes

```
+---+            +--------------+     +-----+
| M |------------| method table |-----| foo |
+---+            +--------------+     +-----+
+-----------+          |      |       +-----+
| iclass(M) |----------+      +-------| bar |
+-----------+                         +-----+
```

This change has a similar effect on the constant table. In addition to this, T_ICLASS no longer holds a reference to an ivar table. Code that accesses the ivar table through iclasses is changed to access it through the object from which the iclass was made. This change impacts autoload and class variable lookup.

---
## Why?

Code: https://github.com/ruby/ruby/pull/3410

The main goal of this change is to make iclasses and modules write barrier protected. At the moment, they are "shady", which means the GC has to do extra work to handle them. In code bases that use modules a lot, iclasses can easily take up a significant portion of the heap and impact GC time. Inserting write barriers was tricky in the old setup, because of the way `M` and `iclass(M)` share the method table.

Having write barriers for iclasses means they can age in the generational GC. Once aged, the GC can sometimes skip subgraphs rooted at these objects, improving performance.

## Impact to GC time

I measured the impact to minor GC time with the following steps:
 - load an application
 - run `GC::Profile.enable`
 - allocate 50 million objects
 - run `GC::Profile.report`

Here is the impact to average minor GC time on various apps:

| Application            | Before   | After    | Speedup ratio |
|------------------------|----------|----------|---------------|
| CRuby's test-all suite | 2.438 ms | 2.289 ms | 1.06          |
| `rails new` app        | 1.911 ms | 1.798 ms | 1.06          |
| Private app A          | 5.182 ms | 5.168 ms | 1.00          |
| Private app B          | 185.7 ms | 107.9 ms | 1.72          |

Private app A's heap size is about 22 MiB compared to B's 250 MiB.
App B boots up about 15% faster with this change.

## Impact to class variable lookup

I included a benchmark in the patch to measure the impact to class variable lookup performance.
The difference seems negligible.

## Conclusion

This change seems to reduce GC time for real-world applications.

---

Credits to @tenderlovemaking for coming up with the idea for this change.

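To make the class variable part more concrete, here is a small illustration of the kind of lookup that goes through an included module. `Counter` and `Widget` are made-up names for this example. The iclass itself is internal and not visible from Ruby code; the sketch only shows the Ruby-level behavior that the change is expected to preserve.

```ruby
# Hypothetical module and class, for illustration only.
module Counter
  @@count = 0

  def self.bump
    @@count += 1
  end
end

class Widget
  include Counter # internally creates an iclass for Counter in Widget's ancestry

  def current_count
    @@count # not defined on Widget itself; found by walking the ancestors
  end
end

Counter.bump
puts Widget.new.current_count             # value stored on Counter
puts Widget.class_variable_get(:@@count)  # same variable, reached via Widget
p Widget.class_variables(false)           # variables defined directly on Widget
```

The class variable lives on `Counter`, so reads from `Widget` have to reach the module's table through the ancestor chain, which is the path the benchmark in the patch exercises.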