Feature #20443

Updated by eightbitraptor (Matt V-H) 7 months ago

[[Github PR #10598]](https://github.com/ruby/ruby/pull/10598) 

 ## Background 

 Ruby's GC running during Rails requests can have negative impacts on currently 
 running requests, causing applications to have high tail-latency. 

 A technique to mitigate this high tail-latency is Out-of-band GC (OOBGC). This 
 is basically where the application is run with GC disabled, and then GC is 
 explicitly started after each request, or when no requests are in progress. 
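
 The classic pattern described above can be sketched roughly as follows. This is a minimal illustration, not how any particular server implements it; `serve_with_oobgc` is a hypothetical helper name, and a real server would hook this into its request loop and typically leave GC disabled between requests.

```ruby
# Hypothetical wrapper around a single request (name invented for illustration).
def serve_with_oobgc
  already_disabled = GC.disable      # returns true if GC was already disabled
  yield                              # handle the request; no automatic GC runs here
ensure
  GC.start                           # out-of-band collection; runs even while GC is disabled
  GC.enable unless already_disabled  # restore the previous state for this demo
end

# The block stands in for request handling.
response = serve_with_oobgc { [200, {}, ["ok"]] }
```

 Every request pays for a full `GC.start` here, which is the throughput cost discussed in the rest of this ticket.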

 This can reduce the tail latency, but also introduces problems of its own. Long 
 GC pauses after each request reduce throughput. This is more pronounced on 
 threading servers like Puma because all the threads have to finish processing 
 user requests and be "paused" before OOBGC can be triggered. 

 This throughput decrease happens for a couple of reasons: 

 1. There are few heuristics available for users to determine when GC should run.
    This means that in OOBGC scenarios, it's possible that major GC's are being
    run more than necessary.
 2. The lack of any GC during a request means that lots of garbage objects have
    been created and not cleaned up, so the process is using more memory than it
    should. This requires the major GC's run as part of OOBGC to do more work and
    therefore take more time.

 This ticket attempts to address these issues by:

 1. Providing `GC.disable_major` and its antonym `GC.enable_major` to disable and
    enable only major GC.
 2. Providing `GC.needs_major?` as a basic heuristic allowing users to tell when
    Ruby should run a major GC.

 These ideas were originally proposed by @ko1 and @byroot in [this Rails
 issue](https://github.com/rails/rails/issues/50449).

 Disabling major GC's would still allow minor GC's to run during the request.
 This avoids the ballooning memory usage caused by not running GC at all, and
 reduces the time that a major takes when we do run it: the nursery objects have
 already been cleaned up during the request, so there is less work for a major
 GC to do.
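
 One way to observe how much minor and major GC work happens during a request is to diff the counters that `GC.stat` already exposes. A small sketch; `gc_counts_during` is a hypothetical helper name, not something added by this PR:

```ruby
# Hypothetical helper: count minor and major GC runs that happen inside a block.
def gc_counts_during
  before = GC.stat
  yield
  after = GC.stat
  {
    minor: after[:minor_gc_count] - before[:minor_gc_count],
    major: after[:major_gc_count] - before[:major_gc_count]
  }
end

# The allocation loop stands in for the work done by one request.
counts = gc_counts_during { 100_000.times { Object.new } }
puts "minor: #{counts[:minor]}, major: #{counts[:major]}"
```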

 This can be used in combination with `GC.needs_major?` to selectively run an
 OOBGC only when necessary.

 ## Implementation 

 This PR adds 3 new methods to the `GC` module:

 - `GC.disable_major` This prevents major GC's from running automatically. It 
   does not restrict minors. When `objspace->rgengc.need_major_gc` is set and a 
   GC is run, instead of running a major, new heap pages will be allocated and a 
   minor run instead. `objspace->rgengc.need_major_gc` will remain set until a 
   major is manually run. If a major is not manually run then the process will 
   eventually run out of memory. 
  
   When major GC's are disabled, object promotion is disabled. That is, no 
   objects will increment their ages during a minor GC. This is to attempt to 
   minimise heap growth during the period between major GC's, by restricting the 
   number of old-gen objects that will remain unconsidered by the GC until the 
   next major. 
  
   When `GC.start` is run, major GC's are temporarily enabled, a GC is triggered
   with the options passed to `GC.start`, and the `disable_major` state is then
   restored to what it was before `GC.start` was called.
  
 - `GC.enable_major` This simply unsets the bit preventing major GC's. This will 
   revert the GC to normal generational behaviour. Everything behaves as default 
   again. 

 - `GC.needs_major?` This exposes the value of `objspace->rgengc.need_major_gc` 
   to the user level API. This is already exposed in 
   `GC.latest_gc_info[:need_major_by]` but I felt that a simpler interface would 
   make this easier to use and result in more readable code, e.g.:
  
 ``` 
 out_of_band do  
   GC.start if GC.needs_major?   
 end  
 ``` 
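
 On a Ruby build that does not have the proposed `GC.needs_major?`, a similar check can be approximated from the existing `GC.latest_gc_info` key mentioned above. A minimal sketch; `major_gc_needed?` is a hypothetical helper name and is not part of this PR:

```ruby
# Hypothetical helper: prefer the proposed predicate when this Ruby provides it,
# otherwise fall back to the existing GC info key.
def major_gc_needed?
  return GC.needs_major? if GC.respond_to?(:needs_major?)

  # :need_major_by is nil when no major GC is pending, otherwise a reason symbol
  !GC.latest_gc_info(:need_major_by).nil?
end

# Between requests: only pay for a major GC when the GC has asked for one.
GC.start if major_gc_needed?
```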

 Because object aging is disabled when majors are disabled, it is recommended to
 use this in conjunction with `Process.warmup`, which will prepare the heap by
 running a major GC, compacting the heap, and promoting every remaining object to
 old-gen. This ensures that minor GC's are running over the smallest possible set
 of young objects when major GC's are disabled.
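
 The recommended ordering could look something like the sketch below, run once per worker after the application has booted. `prepare_worker_gc` is a hypothetical hook name; `GC.disable_major` is the method proposed in this ticket, so the call is guarded and skipped on Rubies that do not have it, and `Process.warmup` is guarded the same way for older Rubies.

```ruby
# Hypothetical boot hook for a worker process; run once before serving requests.
def prepare_worker_gc
  applied = { warmup: false, majors_disabled: false }

  # Prepare the heap first so that long-lived objects are already old-gen.
  if Process.respond_to?(:warmup)
    Process.warmup
    applied[:warmup] = true
  end

  # Proposed in this ticket; only called when this Ruby build provides it.
  if GC.respond_to?(:disable_major)
    GC.disable_major
    applied[:majors_disabled] = true
  end

  applied
end

applied = prepare_worker_gc
```

 Warming up before disabling majors matters because, per the description above, objects stop aging once majors are disabled.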

 ## Benchmarks 

 We ran some tests in production on Shopify's core monolith over a weekend and 
 found that: 

 **Mean time spent in GC, as well as p99.9 and p99.99 GC times are all 
 improved.**  

 <img width="1000" alt="Screenshot 2024-04-22 at 16 41 49" 
 src="https://github.com/ruby/ruby/assets/31869/6cff5b11-2e21-40c1-bb84-d994e0e1798d"> 

 **p99 GC time is slightly higher.**  

 <img width="1000" alt="Screenshot 2024-04-22 at 16 44 55" 
 src="https://github.com/ruby/ruby/assets/31869/dc645cbe-9495-46f0-8485-24e790c42f32"> 

 We're running far fewer OOBGC major GC's now that we have `GC.needs_major?` than
 we were before, and we believe that this is contributing to a slightly increased
 number of minor GC's, raising the p99 slightly.

 **App response times are all improved** 

 We see a ~2% reduction in average response times when compared against standard GC
 (~7% p99, ~3% p99.9 and ~4% p99.99).

 <img width="1000" alt="Screenshot 2024-04-23 at 09 27 17" src="https://gist.github.com/assets/31869/70e81fa5-77b2-469a-8945-88bf8f8fefe9">


 This drops slightly to a ~1% reduction in average response times when compared
 against our standard OOBGC approach (~6% p99, ~2% p99.9 and ~3% p99.99).

 <img width="1000" alt="Screenshot 2024-04-23 at 09 27 29" src="https://gist.github.com/assets/31869/cbaa3807-0cd1-4dba-a5e6-b9df91024d73">

 EDIT: to correct a formula error in the original Average charts, numbers updated.


