Feature #22119
openThread: inherit storage on child threads
Description
Motivation¶
At work, we've been dealing with the need of propagating context across threads. The use cases are several, ranging from propagating logging / observability context, region / database shards, etc, on worker threads for specific workloads which require that implicit knowledge. We've historically used thread variables for it, via the Thread#thread_variable_get/set family of functions, and have been enforcing / gating usage of threads via a wrapper function which does the context propagation heavy-lifting. Something like:
def new_thread do
logging_context = Logging.current_context.deep_dup
database_context = DB.current_context.deep_dup
# etc, etc
Thread.new do
Logging.set_context(logging_context)
DB.set_context/database_context)
yield
end
end
alongside a plethora of checks and rubocops for forbid direct usage of Thread.new. However, we recently found out that that's not enough, since 3rd party libraries that we use will make their own use of Thread.new without our custom context propagation, which broke our logic in cases where we'd require p.ex. correcy database context.
In order to fix that, we've resorted to monkeypatching Thread.new (and Fiber.new), which feels necessary for this use case, but not right. This seems like a feature missing from ruby.
Research¶
ruby does have prior art for "inheritable" storage: Fiber#storage; when a new fiber is created, by default, it'll have the same storage from the fiber it was created from (unless Fiber.new(storage: nil)). This is exactly what's missing here.
Another example, with a little more ceremony, of the same problem solved in ruby is concurrent-ruby's ThreadLocalVar, which seems to be inspired in Java's own InheritableThreadLocal.
Proposal¶
ruby should have an OTTB solution for inheritable thread storage. Some options to consider:
1: :storage kwarg¶
An option would be to align with Fiber.new and support a storage kwarg:
Thread.current.thread_variable_set(:foo, "bar")
Thread.new(storage: nil) do # this has to be default
Thread.current.thread_variable_get(:foo) #=> nil
end
Thread.new(storage: Thread.current.storage) do # would be a new method
Thread.current.thread_variable_get(:foo) #=> nil
end
as mentioned above, for backwards compatibility reasons, by default storage wouldn't be inherited, which wouldn't fit our use-case that well, unless the default could be changed viat env var, i.e. RUBY_THREAD_STORAGE=inherit or something of the kind (which ruby historically hasn't supported).
2. Thread::Variable¶
A new primitive, Thread::Variable, could be integrated, that would have the same semantics as concurrent-ruby's ThreadLocalVar:
VAL = Thread::Variable.new(2)
Thread.new do
puts VAL.value #=> 2
VAL.value = 3
puts VAL.value #=> 3
end.join
puts VAL.value #=> 2
I feel like 1. is more aligned with how ruby has dealt with the problem space, but the backwards compatibility issues may be a problem, so 2 is probably more feasible, despite its Java'iness.