Project

General

Profile

Feature #11026

How atomic should dynamic regexp with "once" flag be?

Added by headius (Charles Nutter) over 4 years ago. Updated about 4 years ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:<unknown>]

Description

The test_once_multithread test in ruby/test_regexp.rb tries to confirm that a dynamic "once" regexp is created atomically. Here's the test:

  def test_once_multithread
    m = Mutex.new
    pr3 = proc{|i|
      /#{m.unlock; sleep 0.5; i}/o
    }
    ary = []
    n = 0
    th1 = Thread.new{m.lock; ary << pr3.call(n+=1)}
    th2 = Thread.new{m.lock; ary << pr3.call(n+=1)}
    th1.join; th2.join
    assert_equal([/1/, /1/], ary)
  end

The logic here works as follows:

  • Each thread locks the mutex, and then proceeds to execute the regexp with a different dynamic value
  • Executing the regexp unlocks the mutex and applies the dynamic value

The test tries to guarantee that only the first thread to start processing the dynamic regexp will finally cache it, but I think this may be overreaching.

In order to make this test pass in a threaded implementation, the entire body of the regexp would need to be evaluated under synchronization, blocking other threads from doing the same. This would also require a synchronization lock to be acquired for every subsequent access of that regular expression, to ensure that multiple threads arriving at the code at the same time don't try to run the code at the same time or update the cached value twice. This could possibly be reduced with double-checked locking, but having this much synchronization overhead for a simple Ruby expression seems excessive.

Also, this synchronization would only protect threads from running the //o body twice; it does not guarantee ordering of those threads outside of the //o, so if any users depended on this "first thread wins" behavior, they'd still be surprised because they can't guarantee which thread hits the code first.

I believe this test should only confirm that all threads see the same regexp result, rather than trying to test that only one thread ever evaluates the regexp.

As far as specified behavior, I believe the only guarantee //o should make is that it will cache exactly one value and never execute again after that value has been cached, rather than making guarantees about the atomicity of dynamic operations within that regexp. In short, it should guarantee all threads see the same //o expression.

Thoughts?


Related issues

Related to Ruby master - Bug #6701: once literal doesn't care escapeClosed07/05/2012Actions

History

#2

Updated by nobu (Nobuyoshi Nakada) over 4 years ago

  • Related to Bug #6701: once literal doesn't care escape added
#3

Updated by nobu (Nobuyoshi Nakada) over 4 years ago

  • Description updated (diff)

You mean the last assertion should be that ary just has same regexps, not the exact values?

#4

Updated by headius (Charles Nutter) over 4 years ago

Nobuyoshi Nakada wrote:

You mean the last assertion should be that ary just has same regexps, not the exact values?

I would find that more acceptable than synchronizing around all the embedded code in a dynamic regexp.

#6701 is interesting. I would claim that both cases are abnormal exits from code and therefor the "once" regexp never completes processing, but I cannot read Japanese to understand the bug.

I guess my general proposal is that we can only really guarantee that one thread and only one thread will ever complete processing a "once" regexp, though multiple may start processing.

Also available in: Atom PDF