Bug #17669
closed
An exception still breaks monitor state and causes deadlock in 2.6.7
Description
lib/monitor.rb provides Monitor.
However, its state handling is fragile against interrupts, such as those caused by Thread.kill or by timeout libraries,
even after some uses of Thread.handle_interrupt were introduced in https://bugs.ruby-lang.org/issues/15992.
In fact, a timeout exception can be raised anywhere.
If it is raised while the thread is executing just before the begin block,
  def mon_synchronize
    # Prevent interrupt on handling interrupts; for example timeout errors
    # it may break locking state.
->  Thread.handle_interrupt(Exception => :never){ mon_enter }
    begin
      yield
    ensure
      Thread.handle_interrupt(EXCEPTION_NEVER){ mon_exit }
    end
  end
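it breaks the state of the monitor and causes a deadlock: the monitor has already been entered, but the ensure clause that calls mon_exit is not yet in effect.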
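To illustrate the window described above, here is a minimal sketch of one way to avoid it, assuming a plain Mutex rather than Monitor. SketchLock and its methods are hypothetical names made up for this example and are not part of lib/monitor.rb, and this is not necessarily the approach taken in the linked commit. The idea is to keep interrupts deferred across both the acquire and the begin/ensure, and to re-enable them only while the caller's block runs.

```ruby
# Hypothetical illustration only; SketchLock is not part of lib/monitor.rb.
class SketchLock
  def initialize
    @mutex = Mutex.new
  end

  def synchronize
    # Defer Thread#raise-style interrupts for the whole acquire/release path,
    # so there is no gap between taking the lock and entering begin/ensure.
    Thread.handle_interrupt(Exception => :never) do
      @mutex.lock
      begin
        # Re-enable interrupts only while the caller's block runs.
        Thread.handle_interrupt(Exception => :immediate) { yield }
      ensure
        @mutex.unlock
      end
    end
  end

  def locked?
    @mutex.locked?
  end
end

lock = SketchLock.new
entered = Queue.new

worker = Thread.new do
  lock.synchronize do
    entered << true
    sleep # wait here until interrupted
  end
end
worker.report_on_exception = false

entered.pop # the worker now holds the lock
worker.raise(RuntimeError, "simulated timeout")
begin
  worker.join
rescue RuntimeError
  # expected: the simulated timeout propagates out of the worker
end

held_after = lock.locked?
puts "lock still held after interrupt: #{held_after}"
```

One trade-off of this sketch: because interrupts are deferred while waiting for the lock, a thread blocked in the acquire cannot be interrupted by Thread#raise until it gets the lock.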
I can confirm that this happens in both the 2.6.7 head and the 2.6.6 release.
/bin/bash -c \
"date; ruby -v; ruby reproducible.rb; tail -n 10 /tmp/tmp.txt; date;" | tee ruby:2.6.7-macosx.log
docker run -it --rm -v `pwd`:`pwd` -w `pwd` ruby:2.6.6-alpine3.13 /bin/ash -c \
"date; ruby -v; ruby reproducible.rb; tail -n 10 /tmp/tmp.txt; date;" | tee ruby:2.6.6-alpine3.13.log
Technically, the issue is also reproducible in 2.5.8 because it shares the same related code.
Incidentally, this doesn't happen in either 2.7.2 or 3.0.0 because the monitor was reimplemented in C.
Our busy production Puma servers have suffered from this susceptibility to timeouts, which frequently leaves worker threads in a process completely hung.
The commit https://github.com/ruby/ruby/pull/4204/commits/e99c823f16918677b823255c44142910e02922c1 should fix this issue.
Updated by Eregon (Benoit Daloze) about 4 years ago
This is the same bug that @headius (Charles Nutter) reported in https://github.com/ruby/monitor/issues/2.
I'd like to ask that the repository be made public (currently it's private).
If the concern is that it might be confusing because the recent monitor stdlib does not use that source, how about renaming that repository, e.g. to monitor-rb?
Updated by jeremyevans0 (Jeremy Evans) about 4 years ago
- Status changed from Open to Closed
- Backport changed from 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN to 2.5: UNKNOWN, 2.6: REQUIRED, 2.7: DONTNEED, 3.0: DONTNEED
Updated by Eregon (Benoit Daloze) over 3 years ago
It looks like the backport of this fix was missed (https://github.com/ruby/ruby/blob/ruby_2_6/lib/monitor.rb#L230-L239 does not have the fix).
The Backport field looks correct to me though.
Updated by Eregon (Benoit Daloze) over 3 years ago
Ah, 2.6.8 is in security maintenance, so maybe this is not considered then?