Feature #11026
openHow atomic should dynamic regexp with "once" flag be?
Description
The test_once_multithread
test in ruby/test_regexp.rb tries to confirm that a dynamic "once" regexp is created atomically. Here's the test:
def test_once_multithread
m = Mutex.new
pr3 = proc{|i|
/#{m.unlock; sleep 0.5; i}/o
}
ary = []
n = 0
th1 = Thread.new{m.lock; ary << pr3.call(n+=1)}
th2 = Thread.new{m.lock; ary << pr3.call(n+=1)}
th1.join; th2.join
assert_equal([/1/, /1/], ary)
end
The logic here works as follows:
- Each thread locks the mutex, and then proceeds to execute the regexp with a different dynamic value
- Executing the regexp unlocks the mutex and applies the dynamic value
The test tries to guarantee that only the first thread to start processing the dynamic regexp will finally cache it, but I think this may be overreaching.
In order to make this test pass in a threaded implementation, the entire body of the regexp would need to be evaluated under synchronization, blocking other threads from doing the same. This would also require a synchronization lock to be acquired for every subsequent access of that regular expression, to ensure that multiple threads arriving at the code at the same time don't try to run the code at the same time or update the cached value twice. This could possibly be reduced with double-checked locking, but having this much synchronization overhead for a simple Ruby expression seems excessive.
Also, this synchronization would only protect threads from running the //o
body twice; it does not guarantee ordering of those threads outside of the //o
, so if any users depended on this "first thread wins" behavior, they'd still be surprised because they can't guarantee which thread hits the code first.
I believe this test should only confirm that all threads see the same regexp result, rather than trying to test that only one thread ever evaluates the regexp.
As far as specified behavior, I believe the only guarantee //o
should make is that it will cache exactly one value and never execute again after that value has been cached, rather than making guarantees about the atomicity of dynamic operations within that regexp. In short, it should guarantee all threads see the same //o
expression.
Thoughts?
Updated by headius (Charles Nutter) over 9 years ago
Filed https://github.com/jruby/jruby/issues/2798 to track this in JRuby.
Updated by nobu (Nobuyoshi Nakada) over 9 years ago
- Related to Bug #6701: once literal doesn't care escape added
Updated by nobu (Nobuyoshi Nakada) over 9 years ago
- Description updated (diff)
You mean the last assertion should be that ary
just has same regexps, not the exact values?
Updated by headius (Charles Nutter) over 9 years ago
Nobuyoshi Nakada wrote:
You mean the last assertion should be that
ary
just has same regexps, not the exact values?
I would find that more acceptable than synchronizing around all the embedded code in a dynamic regexp.
#6701 is interesting. I would claim that both cases are abnormal exits from code and therefor the "once" regexp never completes processing, but I cannot read Japanese to understand the bug.
I guess my general proposal is that we can only really guarantee that one thread and only one thread will ever complete processing a "once" regexp, though multiple may start processing.
Updated by hsbt (Hiroshi SHIBATA) almost 3 years ago
- Project changed from 14 to Ruby master