Feature #20876
Updated by ioquatix (Samuel Williams) 12 days ago
This is an evolution of the previous proposal: https://bugs.ruby-lang.org/issues/20855

## Background

The current Fiber Scheduler performance can be significantly impacted by blocking operations that cannot be deferred to the event loop, particularly in high-concurrency environments where Fibers rely on non-blocking operations for efficient task execution.

## Proposal

Pull Request: https://github.com/ruby/ruby/pull/12016

We will introduce a new fiber scheduler hook called `blocking_operation_wait`:

```ruby
class MyScheduler
  # ...

  def blocking_operation_wait(work)
    # Example (trivial) implementation:
    Thread.new(&work).join
  end
end
```

We introduce a new flag for `rb_nogvl`: `RB_NOGVL_OFFLOAD_SAFE`, which indicates that `rb_nogvl(func, ...)` is a blocking operation that is safe to execute on a different thread or thread pool (or some other context).

When a C extension invokes `rb_nogvl(..., RB_NOGVL_OFFLOAD_SAFE)` and a fiber scheduler is available, all the arguments will be saved into an instance of a callable object (at this time a `Proc`) called `work` and passed to the `blocking_operation_wait` fiber scheduler hook. When `work` is `#call`ed, it will execute `rb_nogvl` again with all the same arguments.

The fiber scheduler can decide how to execute that work, e.g. on a separate thread or thread pool, to mitigate the performance impact of the blocking operation on the event loop.

![](clipboard-202411071531-gw8tg.png)

### Cancellation

`rb_nogvl` takes several arguments, including `func` for the actual work and `unblock_func` to cancel `func` if possible. These arguments are preserved in the `work` proc, and cancellation works the same. However, some extra effort may be required in the fiber scheduler hook, e.g.

```ruby
class MyScheduler
  # ...

  def blocking_operation_wait(work)
    thread = Thread.new(&work)
    thread.join
    # The work finished normally, so there is nothing left to kill.
    thread = nil
  ensure
    # Only reached with a live thread if `join` was interrupted.
    thread&.kill
  end
end
```

### Interruption Points

When using the `RB_NOGVL_OFFLOAD_SAFE` flag, the semantics of interruption points in `rb_nogvl` change. Currently, `rb_nogvl` only checks for interrupts **after** executing the `BLOCKING_REGION`. However, when using `RB_NOGVL_OFFLOAD_SAFE`, an interruption point is introduced **before** executing the `BLOCKING_REGION`, because we invoke Ruby code (the fiber scheduler hook) before the blocking operation is performed.

## Example

Using the branch of the `async` gem at https://github.com/socketry/async/pull/352/files and enabling zlib deflate to use this feature, the following performance improvement was achieved:

```ruby
require "zlib"
require "async"
require "benchmark"

DATA = Random.new.bytes(1024*1024*100)

duration = Benchmark.measure do
  Async do
    10.times do
      Async do
        Zlib.deflate(DATA)
      end
    end
  end
end

# Print the measured wall-clock time.
puts duration.real

# Ruby 3.3.4: ~16 seconds
# Ruby 3.4.0 + PR: ~2 seconds.
```

To run this benchmark yourself, you must compile CRuby with these two PRs:

- https://github.com/ruby/ruby/pull/12016
- https://github.com/ruby/zlib/pull/88

In addition, enable `RB_NOGVL_OFFLOAD_SAFE` in `zlib.c`'s call to `rb_nogvl`.

Then, use this branch of async: https://github.com/socketry/async/pull/352
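
The proposal mentions that `blocking_operation_wait` could hand `work` to a thread pool rather than spawning a thread per call. The following is a minimal sketch of that idea, not the implementation used by the PR or by `async`. `BlockingOperationPool` and `PoolScheduler` are hypothetical names introduced here for illustration, and the hook blocks the calling thread while waiting, whereas a real scheduler would presumably suspend the current fiber and resume it from the event loop. Cancellation via `unblock_func` is not handled.

```ruby
# Hypothetical helper: a fixed-size pool of worker threads.
class BlockingOperationPool
  def initialize(size: 4)
    @queue = Thread::Queue.new
    @threads = Array.new(size) do
      Thread.new do
        # Queue#pop returns nil once the queue is closed and empty,
        # which ends the worker loop.
        while (job = @queue.pop)
          work, result = job
          begin
            work.call
            result << nil
          rescue Exception => error
            # Hand the failure back to the waiting caller.
            result << error
          end
        end
      end
    end
  end

  # Run `work` on a worker thread and wait until it has finished.
  def run(work)
    result = Thread::Queue.new
    @queue << [work, result]
    error = result.pop
    raise error if error
    nil
  end

  # Stop accepting work and wait for the workers to exit.
  def close
    @queue.close
    @threads.each(&:join)
  end
end

# Hypothetical scheduler showing only the new hook.
class PoolScheduler
  # ... other fiber scheduler hooks omitted ...

  def initialize(pool = BlockingOperationPool.new)
    @pool = pool
  end

  def blocking_operation_wait(work)
    # Assumption: blocking here stands in for suspending the fiber.
    @pool.run(work)
  end
end
```

Reusing a bounded set of workers avoids the cost of creating a thread for every blocking call and caps how many offloaded operations run at once. The pool size of 4 is an arbitrary default for this sketch.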