Bug #20047

ConditionVariable#wait has spurious wakeups from signal traps

Added by Eregon (Benoit Daloze) 5 months ago. Updated 5 months ago.

Status:
Open
Assignee:
-
Target version:
-
ruby -v:
ruby 3.3.0dev (2023-11-27T17:17:52Z master cc05a60c16) [x86_64-linux]
[ruby-core:115645]

Description

Signal.trap("INT") { p :SIGINT }

Thread.new do
  sleep 0.6
  `kill -INT #{$$}`
end

m, cv = Mutex.new, ConditionVariable.new
m.synchronize do
  r = ARGV[0] ? cv.wait(m, 2) : cv.wait(m)
  p ["ConditionVariable#wait returned", r]
end

The above program (without CLI arguments) should hang on .wait and never return, because neither ConditionVariable#signal nor ConditionVariable#broadcast is called.
That's the behavior on TruffleRuby and JRuby, but not on CRuby, where .wait wakes up spuriously.

$ ruby -v spurious_cv.rb
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
:SIGINT
["ConditionVariable#wait returned", 1]

$ ruby -v spurious_cv.rb
truffleruby 23.1.1, like ruby 3.2.2, Oracle GraalVM Native [x86_64-linux]
:SIGINT
# hangs as expected

$ ruby -v spurious_cv.rb
jruby 9.4.5.0 (3.1.4) 2023-11-02 1abae2700f OpenJDK 64-Bit Server VM 17.0.8+7 on 17.0.8+7 +jit [x86_64-linux]
:SIGINT
# hangs as expected

When given an argument, it should wait 2 seconds.
But on CRuby it wakes up spuriously:

$ ruby -v spurious_cv.rb timeout
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
:SIGINT
["ConditionVariable#wait returned", 0]

$ ruby -v spurious_cv.rb timeout
truffleruby 23.1.1, like ruby 3.2.2, Oracle GraalVM Native [x86_64-linux]
:SIGINT
["ConditionVariable#wait returned", #<ConditionVariable:0x188>]

$ ruby -v spurious_cv.rb timeout
jruby 9.4.5.0 (3.1.4) 2023-11-02 1abae2700f OpenJDK 64-Bit Server VM 17.0.8+7 on 17.0.8+7 +jit [x86_64-linux]
:SIGINT
["ConditionVariable#wait returned", #<Thread::ConditionVariable:0x482ba4b1>]

ConditionVariable#wait needs to be interrupted to execute the signal handler, which does { p :SIGINT } on the main thread.
However, ConditionVariable#wait should automatically be restarted internally after that, with the remaining timeout.
That is what I think is the bug in CRuby.

While it's good practice to have a loop around ConditionVariable#wait (at least when there is no timeout), it still seems highly unexpected in a high-level language like Ruby for ConditionVariable#wait to return when neither ConditionVariable#signal nor ConditionVariable#broadcast is called (i.e., a spurious wakeup).
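
For illustration, the usual loop for the no-timeout case might look like the sketch below. The `done` flag is an assumed piece of shared state guarded by the mutex; it is not part of the reproduction script above.

```ruby
m, cv = Mutex.new, ConditionVariable.new
done = false # assumed predicate, protected by m

# Producer: sets the predicate and signals while holding the mutex.
Thread.new do
  sleep 0.1
  m.synchronize do
    done = true
    cv.signal
  end
end

m.synchronize do
  # Re-check the predicate after every wakeup, so a wakeup that was not
  # caused by signal/broadcast just goes back to waiting.
  cv.wait(m) until done
end

p done
```

The loop only helps when there is a predicate to re-check. The reproduction script has none, so there it cannot tell a spurious wakeup from a real one.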

Also, adding a loop is non-trivial when a timeout argument is passed, because one then needs to manually track the remaining timeout instead of letting ConditionVariable#wait do its job correctly.
One also needs to break out of the loop when ConditionVariable#wait returns nil, which is quite error-prone.
It would be much simpler to just use cv.wait(mutex, timeout) and have it work correctly.
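
To make the bookkeeping concrete, here is a minimal sketch of such a loop with a timeout. `wait_until` is a hypothetical helper, not part of Ruby's API, and `ready` is an assumed predicate. The sketch tracks a monotonic deadline and deliberately ignores the return value of ConditionVariable#wait, since the outputs above show that it differs between implementations.

```ruby
# Hypothetical helper (not a Ruby API): waits until the block returns true
# or `timeout` seconds have elapsed. Must be called with `mutex` held.
# Returns true if the predicate became true, false on timeout.
def wait_until(mutex, cv, timeout)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until yield
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    return false if remaining <= 0
    # May return early (e.g. after a signal trap); the loop then re-checks
    # the predicate and waits again for the time that is left.
    cv.wait(mutex, remaining)
  end
  true
end

m, cv = Mutex.new, ConditionVariable.new
ready = false # assumed predicate, protected by m

Thread.new do
  sleep 0.1
  m.synchronize do
    ready = true
    cv.signal
  end
end

result = m.synchronize { wait_until(m, cv, 2) { ready } }
timed_out = m.synchronize { wait_until(m, cv, 0.05) { false } }
p [result, timed_out]
```

This is the extra code every caller has to write as long as cv.wait(mutex, timeout) can return before the timeout without a signal or broadcast.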

From https://github.com/ruby-concurrency/concurrent-ruby/issues/1015

Actions #1

Updated by Eregon (Benoit Daloze) 5 months ago

  • Description updated (diff)