Feature #17307

Updated by Eregon (Benoit Daloze) over 3 years ago

I would like to design a way to mark C extensions as thread-safe, Ractor-safe, or unsafe (= needs process-global lock). 
 By default, if not marked, C extensions would be treated as unsafe for compatibility. 

 Specifically, TruffleRuby supports C extensions, but for scalability it is important to run at least some of them in parallel (e.g., HTTP parsing in Puma). 
 This was notably mentioned in my [RubyKaigi talk](https://speakerdeck.com/eregon/running-rack-and-rails-faster-with-truffleruby?slide=17). 
 By default, TruffleRuby acquires a global lock when executing C extension code, for maximum compatibility (Ruby code, on the other hand, can always run in parallel). 
 A command-line option can disable that lock, but it then applies to all C extensions at once. 
 The important property for TruffleRuby is that the C extension does not need a global lock, i.e., that it synchronizes any mutable state in C that could be accessed by multiple threads, such as global C variables. 
 I believe many C extensions are already thread-safe, or can easily become thread-safe, because they do not rely on global state and do not share the RData objects between threads. 

 Ractor also needs a way to mark C extensions, to know if it's OK to use the C extension in multiple Ractors in parallel, and that the C extension will not leak non-shareable objects from one Ractor to another, which would lead to bugs & segfaults. 
 Otherwise, C extensions could only be used on the main/initial Ractor (or would need to acquire a process-global lock whenever executing C extension code and ensure no non-shareable objects leak between Ractors), which would be a very big limitation (almost every non-trivial application depends on a C extension transitively). 

 In both cases, global state in the C extension needs synchronization. 
 In the thread-safe case, mutable state in C that could be accessed by multiple Ruby threads needs to be synchronized too (there might be no such state, e.g., if C extension objects are created per Thread). 
 In the Ractor case, the C extension must never pass an object from a Ractor to another, unless it is a shareable object. 
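
 To illustrate the global-state requirement, here is a minimal sketch of a C extension that keeps one global counter and synchronizes it with C11 atomics instead of relying on a global lock. The names `parsed_requests`, `record_parsed_request`, and `parsed_request_count` are hypothetical and not taken from any real extension; a mutex would work equally well for larger state.

```c
#include <stdatomic.h>

/* Hypothetical global state of an extension: a counter shared by all threads.
   Objects with static storage duration are zero-initialized. */
static atomic_ulong parsed_requests;

/* Atomically increments the counter and returns the new value.
   Safe to call from several threads without holding a global lock. */
static unsigned long record_parsed_request(void) {
    return atomic_fetch_add(&parsed_requests, 1) + 1;
}

/* Atomically reads the current value of the counter. */
static unsigned long parsed_request_count(void) {
    return atomic_load(&parsed_requests);
}
```

 An extension whose only global state is protected like this would be a candidate for being marked thread-safe.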

 What do you think would be a good way to "mark" C extensions? 
 Maybe defining a symbol in the C extension, similar to the `Init_foo` we have, like say `foo_is_thread_safe`/`foo_is_ractor_safe`? 
 A symbol including the C extension name seems best, to avoid any possible confusion when looking it up. 
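
 As a rough sketch of the symbol idea, a Ruby implementation could build the marker names from the extension name and probe for them after loading the library. Everything below is an assumption made for illustration: the `ext_marks` struct, the `symbol_lookup` callback (which a real loader might implement with something like `dlsym`), and `demo_lookup`, which merely pretends that only `foo_is_thread_safe` is defined.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result of probing an extension; both fields 0 means "unsafe",
   i.e., the extension keeps running under the process-global lock. */
struct ext_marks {
    int thread_safe;
    int ractor_safe;
};

/* Resolves a symbol name to an address, or NULL if the symbol is absent. */
typedef void *(*symbol_lookup)(void *handle, const char *name);

/* Builds "<ext>_is_<kind>" into buf; returns 0 on success, -1 if it does not fit. */
static int marker_name(const char *ext, const char *kind, char *buf, size_t size) {
    int n = snprintf(buf, size, "%s_is_%s", ext, kind);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Probes for both marker symbols; a missing symbol leaves the field at 0. */
static struct ext_marks read_marks(void *handle, const char *ext, symbol_lookup lookup) {
    struct ext_marks marks = {0, 0};
    char name[256];
    if (marker_name(ext, "thread_safe", name, sizeof name) == 0)
        marks.thread_safe = lookup(handle, name) != NULL;
    if (marker_name(ext, "ractor_safe", name, sizeof name) == 0)
        marks.ractor_safe = lookup(handle, name) != NULL;
    return marks;
}

/* Stand-in lookup for demonstration: only "foo_is_thread_safe" exists. */
static void *demo_lookup(void *handle, const char *name) {
    static int marker;
    (void)handle;
    return strcmp(name, "foo_is_thread_safe") == 0 ? (void *)&marker : NULL;
}
```

 With `demo_lookup`, `read_marks(NULL, "foo", demo_lookup)` reports the extension as thread-safe but not Ractor-safe, and an extension defining neither symbol stays unsafe by default.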

 Maybe there are other ways to mark C extensions than defining symbols, that could still be read by the Ruby implementation reliably? 

 I used the term `C extensions` but of course it would apply to native extensions too (including C++/Rust/...). 

 cc @ko1
