Feature #11026: How atomic should dynamic regexp with "once" flag be? - Ruby - Ruby Issue Tracking System

Feature #11026

Updated by nobu (Nobuyoshi Nakada) over 10 years ago

The `test_once_multithread` test_once_multithread test in ruby/test_regexp.rb tries to confirm that a dynamic "once" regexp is created atomically. Here's the test: 

 ~~~ruby 
   

   def test_once_multithread 
     m = Mutex.new 
     pr3 = proc{|i| 
       /#{m.unlock; sleep 0.5; i}/o 
     } 
     ary = [] 
     n = 0 
     th1 = Thread.new{m.lock; ary << pr3.call(n+=1)} 
     th2 = Thread.new{m.lock; ary << pr3.call(n+=1)} 
     th1.join; th2.join 
     assert_equal([/1/, /1/], ary) 
   end 
 ~~~ 

 The logic here works as follows: 

 * Each thread locks the mutex, and then proceeds to execute the regexp with a different dynamic value 
 * Executing the regexp unlocks the mutex and applies the dynamic value 

 The test tries to guarantee that only the first thread to start processing the dynamic regexp will finally cache it, but I think this may be overreaching. 

 In order to make this test pass in a threaded implementation, the entire body of the regexp would need to be evaluated under synchronization, blocking other threads from doing the same. This would also require a synchronization lock to be acquired for every subsequent access of that regular expression, to ensure that multiple threads arriving at the code at the same time don't try to run the code at the same time or update the cached value twice. This could possibly be reduced with double-checked locking, but having this much synchronization overhead for a simple Ruby expression seems excessive. 

 Also, this synchronization would only protect threads from running the `//o` //o body twice; it does not guarantee ordering of those threads outside of the `//o`, //o, so if any users depended on this "first thread wins" behavior, they'd still be surprised because they can't guarantee which thread hits the code first. 

 I believe this test should *only* confirm that all threads see the same regexp result, rather than trying to test that only one thread ever evaluates the regexp. 

 As far as specified behavior, I believe the only guarantee `//o` //o should make is that it will cache exactly one value and never execute again after that value has been cached, rather than making guarantees about the atomicity of dynamic operations within that regexp. In short, it should guarantee all threads see the same `//o` //o expression. 

 Thoughts?

Back

Project

General

Profile

Ruby

Feature #11026