Misc #14735


thread-safe operations in a hash could be documented

Added by rosenfeld (Rodrigo Rosenfeld Rosas) over 4 years ago. Updated over 4 years ago.



Hi, sometimes I find myself fetching data from the database through multiple concurrent queries. For example, suppose the application supports multiple data types which are independent of each other, and we need to perform a set of operations per data type. Usually I'd write one method per data type to extract the related data and run those methods concurrently. Something like this:

require 'thread'
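result = {} # assume this is thread-safe in MRI for now
# `processors` is a placeholder for the data_type => processor pairs
processors.map do |data_type, processor|
  Thread.start{ result[data_type] = processor.process } # `process` stands in for the expensive call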
end.each &:join

This code is quite simple and seems to always work with MRI. A more explicit equivalent, which should also work on Ruby implementations without a GIL, would probably be written like this:

require 'thread'
result = {}
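result_semaphore = Mutex.new
# `processors` is a placeholder for the data_type => processor pairs
processors.map do |data_type, processor|
  Thread.start do
    result_for_data_type = processor.process # expensive call (`process` is a placeholder name)
    result_semaphore.synchronize{ result[data_type] = result_for_data_type }
  end
end.each &:join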
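
For reference, a self-contained version of the synchronized pattern might look like the sketch below. The `processors` hash and its lambdas are hypothetical stand-ins for the real per-data-type queries, which are not shown in this report; only the locking around the Hash write is the point.

```ruby
require 'thread' # harmless on modern Rubies, where Mutex is built in

# Hypothetical stand-ins for the per-data-type processors (assumption).
processors = {
  users:  -> { sleep 0.01; [:alice, :bob] },
  orders: -> { sleep 0.01; [101, 102] },
}

result = {}
result_semaphore = Mutex.new

processors.map do |data_type, processor|
  Thread.start do
    # The expensive call runs outside the lock, so threads still overlap.
    result_for_data_type = processor.call
    # Only the write to the shared Hash is serialized.
    result_semaphore.synchronize { result[data_type] = result_for_data_type }
  end
end.each(&:join)

p result
```

The lock is held only for the assignment, so the cost of the extra code is mostly in lines written rather than in lost concurrency.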

As you can see, it's much more code than the previous version. As I said initially, I use this pattern every now and then, so I'd love to be able to write the first version and be confident that it will always work as expected in MRI.
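
One way to sidestep the question, sketched here under the assumption that each thread only needs to hand back a single value, is to avoid sharing the Hash at all and collect the results with `Thread#value` after the threads finish. The `processors` hash and its lambdas are again hypothetical placeholders.

```ruby
# Hypothetical stand-ins for the per-data-type processors (assumption).
processors = {
  users:  -> { [:alice, :bob] },
  orders: -> { [101, 102] },
}

# Each thread returns its own result; no Hash is written from multiple threads.
threads = processors.map do |data_type, processor|
  [data_type, Thread.start { processor.call }]
end

# Thread#value joins the thread and returns the value of its block.
# The result Hash is built entirely in the main thread.
result = threads.map { |data_type, thread| [data_type, thread.value] }.to_h

p result
```

This does not answer whether concurrent `Hash#[]=` is safe in MRI; it only shows that this particular use case can be written without relying on it.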

I've tried the following to see whether I could trigger a thread-unsafe case in MRI, but it always returns "[ 100000, 100000, nil ]":

require 'thread'
h = {}
(1..100000).map do |i|
  Thread.start{ h[i] = i }
end.each &:join

p h.keys.uniq.size, h.values.uniq.size, h.find{|k, v| k != v }

Is it just by chance, or may I assume that this will always be the case? Maybe it would be useful to document somewhere what can be assumed about thread safety for the various methods. For example, the Hash documentation could include a link such as "If you'd like to understand how each method behaves in a multi-threaded environment, read this document", pointing to another page that explains how it works.

By the way, the 'concurrent' gem seems to assume Hash is thread-safe in MRI as you can see here:

module Concurrent
  if Concurrent.on_cruby?
    class Hash < ::Hash;

Is this officially documented somewhere?

