Bug #4266: Timeouts in threads cause "ThreadError: deadlock; recursive locking" - Ruby - Ruby Issue Tracking System

Actions

Copy link

Bug #4266

closed

Timeouts in threads cause "ThreadError: deadlock; recursive locking"

Added by cjbottaro (Christopher Bottaro) over 14 years ago. Updated over 12 years ago.

Status:

Closed

Assignee:

kosaki (Motohiro KOSAKI)

Target version:

2.0.0

ruby -v:

ruby 1.9.2p136 (2010-12-25 revision 30365) [x86_64-darwin10.6.0]

Backport:

[ruby-core:34362]

Description

=begin
Run the attached file (or this pastie http://pastie.org/1448542) a few times and you'll eventually get:

ThreadError: deadlock; recursive locking:
internal:prelude:8:in lock' <internal:prelude>:8:in synchronize'
bin/deadlock_test.rb:86:in block (4 levels) in <main>' /Users/cjbottaro/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/timeout.rb:57:in timeout'
bin/deadlock_test.rb:85:in block (3 levels) in <main>' bin/deadlock_test.rb:83:in times'
bin/deadlock_test.rb:83:in `block (2 levels) in '

I've had the script run successfully over 5 times in a row before getting the errors, so if it doesn't happen the first few times... keep trying.

The problem doesn't happen in 1.8.7 or Jruby, but does happen in 1.9.1.
=end

Files

Download all files

deadlock_test.rb (554 Bytes) deadlock_test.rb		cjbottaro (Christopher Bottaro), 01/12/2011 01:59 AM
diffdiff (1.38 KB) diffdiff	Add Mutex.own? method	kosaki (Motohiro KOSAKI), 01/12/2011 02:45 PM
mutex-synchronize-use-c-implementation.patch (1.28 KB) mutex-synchronize-use-c-implementation.patch		kosaki (Motohiro KOSAKI), 01/12/2011 04:45 PM
monitor-synchronize-use-c-implementation.patch (4.6 KB) monitor-synchronize-use-c-implementation.patch	Fix for MonitorMixin/monitor synchronize using similar approach as mutex synchronize patch	briangug (Brian Gugliemetti), 05/05/2011 09:01 AM

Related issues 2 (0 open — 2 closed)

Actions

Copy link

Updated by headius (Charles Nutter) over 14 years ago

=begin
FYI, JRuby does not use timeout.rb anymore (as of JRuby 1.3ish I think). We use a native implementation based on Java's timed, thread-pooling executor in java.util.concurrent. The original timeout has many race conditions that are very difficult to fix.
=end

Actions

Copy link

Updated by headius (Charles Nutter) over 14 years ago

=begin
A question was raised about whether #to should understand context, as in knowing a top-level object being coerced to YAML should include the --- header. The answer seems to come from Marshal.

Marshal.dump(object) knows how to produce the marshal header, linking, and so on. Marshal is the master of this format. But Marshal defers to the objects themselves for actual content to go into that marshaled output, calling _dump or marshal_dump on each object in turn. With the #to protocol, Marshal.dump could be implemented as:

def Marshal.dump(obj)
emit_header
obj.to(Marshal)
end

In the same way that marshal_dump is used today.

Another concern raised is whether #to should be expected on all objects all the time, rather than just always calling MyClass.coerce(obj). The answer is simple: you want individual source types to control how they coerce to a target type, rather than expecting the target type to know about all possible sources. The default protocol where #to calls type.coerce would simply be a default behavior.
=end

Actions

Copy link

Updated by headius (Charles Nutter) over 14 years ago

=begin
Sorry about that, wrong bug. Too many bug windows open.
=end

Actions

Copy link

Updated by headius (Charles Nutter) over 14 years ago

=begin
Sorry about that, wrong bug. Too many bug windows open.
=end

Actions

Copy link

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

=begin
This test program also makes ruby crash on my envronment (trunk + linux x86_64).

Related [Bug #3880]?

internal:prelude:8: [BUG] Segmentation fault
ruby 1.9.3dev (2011-01-09 trunk 30500) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0011 p:---- s:0036 b:0036 l:000035 d:000035 CFUNC :exception
c:0010 p:---- s:0034 b:0034 l:000033 d:000033 CFUNC :lock
c:0009 p:0011 s:0031 b:0031 l:000030 d:000030 METHOD internal:prelude:8
c:0008 p:0011 s:0028 b:0028 l:000bb8 d:000027 BLOCK deadlock_test.rb:21
c:0007 p:0109 s:0026 b:0026 l:001ad0 d:001ad0 METHOD /usr/local/lib/ruby/1.9.1/timeout.rb:57
c:0006 p:0019 s:0014 b:0014 l:000bb8 d:002580 BLOCK deadlock_test.rb:20
c:0005 p:---- s:0011 b:0011 l:000010 d:000010 FINISH
c:0004 p:---- s:0009 b:0009 l:000008 d:000008 CFUNC :times
c:0003 p:0010 s:0006 b:0006 l:000bb8 d:0012c8 BLOCK deadlock_test.rb:18
c:0002 p:---- s:0004 b:0004 l:000003 d:000003 FINISH
c:0001 p:---- s:0002 b:0002 l:000001 d:000001 TOP

-- Ruby level backtrace information ----------------------------------------
deadlock_test.rb:18:in block (2 levels) in <main>' deadlock_test.rb:18:in times'
deadlock_test.rb:20:in block (3 levels) in <main>' /usr/local/lib/ruby/1.9.1/timeout.rb:57:in timeout'
deadlock_test.rb:21:in block (4 levels) in <main>' <internal:prelude>:8:in synchronize'
internal:prelude:8:in lock' <internal:prelude>:8:in exception'

-- C level backtrace information -------------------------------------------
./ruby() [0x51ff25] vm_dump.c:797
./ruby() [0x564296] error.c:249
./ruby(rb_bug+0xb1) [0x564431] error.c:266
./ruby() [0x4aec70] signal.c:624
/lib64/libpthread.so.0() [0x382140f440]
./ruby(st_lookup+0xe) [0x4b63ce] st.c:326
./ruby() [0x50d3af] vm_method.c:402
./ruby(rb_funcall2+0x2e) [0x51ccee] vm_eval.c:227
./ruby(rb_class_new_instance+0x30) [0x44f490] object.c:1570
./ruby() [0x516e33] vm_eval.c:79
./ruby() [0x517304] vm_eval.c:290
./ruby() [0x417691] eval.c:564
./ruby(rb_exc_raise+0x26) [0x417886] eval.c:473
./ruby() [0x52487a] thread.c:1301
./ruby(rb_mutex_lock+0x44a) [0x527cfa] thread.c:3266
./ruby() [0x50f0b7] vm_insnhelper.c:403
./ruby() [0x510a36] insns.def:1010
./ruby() [0x5157eb] vm.c:1150
./ruby(rb_yield+0x66) [0x51e116] vm.c:591
./ruby() [0x4469d1] numeric.c:3217
./ruby() [0x50f0b7] vm_insnhelper.c:403
./ruby() [0x510a36] insns.def:1010
./ruby() [0x5157eb] vm.c:1150
./ruby() [0x5186df] vm.c:607
./ruby() [0x5272d1] thread.c:453
./ruby() [0x5273e0] thread_pthread.c:522
/lib64/libpthread.so.0() [0x3821407761]
/lib64/libc.so.6(clone+0x6d) [0x3820ce151d]

=end

Actions

Copy link

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

=begin
Hm, the crash issue seems different and unrelated one from this. becuase ruby_1_9_2 can reproduce
recursive deadlock issue, but can't reproduce crash.

=end

Actions

Copy link

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

File diffdiff diffdiff added
Priority changed from Normal to 5

=begin
Old days, C# had similar issue. and they changed Monitor.enter(= our Mutex.lock) semantics.

http://www.bluebytesoftware.com/blog/2007/01/30/MonitorEnterThreadAbortsAndOrphanedLocks.aspx

So, Can we take similar way? The attached patch kill a race in mutex.synchronize. At least,
my ruby_1_9_2 branch don't reproduce the issue anymore.

Thanks.
=end

Actions

Copy link

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

Assignee set to ko1 (Koichi Sasada)

=begin
And, I think asynchronous exception should be blocked and delayed while running ensure block.
Otherwise Mutex.unlock can be bypassed.

ko1, What do you think?
=end

Actions

Copy link

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

=begin
That said, our Mutex.synchronize() method has two unsafe point.

def synchronize
self.lock
// (1)
begin
yield
ensure
// (2)
self.unlock rescue nil
end
end¶

If the mutex got thread#raise at (1), it doesn't call unlock because
it haven't entered -begin- block. And if got at (2), it also doesn't call
unlock because we are no longer in -begin- block.

My patch fixed only (1).

Thanks.
=end

Actions

Copy link

#10

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

File mutex-synchronize-use-c-implementation.patch mutex-synchronize-use-c-implementation.patch added

=begin
Alternative one is here. This one fixes both (1) and (2). But it only fixes Mutex#synchronize. That said, timeout{} still might not call ruby level ensure block.

=end

Actions

Copy link

#11

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

=begin
Hi,

I plan to commit mutex-synchronize-use-c-implementation.patch at this weekend. So if anyone have objection, please let me know soon.

=end

Actions

Copy link

#12

Updated by shyouhei (Shyouhei Urabe) over 14 years ago

Status changed from Open to Assigned

Actions

Copy link

#13 [ruby-core:35620]

Updated by normalperson (Eric Wong) about 14 years ago

=begin
I have a short-term fix for the related issue in Bug #4289
http://redmine.ruby-lang.org/attachments/1572/0001-timeout.rb-avoid-introducing-new-class-for-every-tim.patch

It works both with and without the mutex-synchronize-use-c-implementation.patch
posted here.

=end

Actions

Copy link

#14

Updated by briangug (Brian Gugliemetti) about 14 years ago

=begin
This issue also applies to MonitorMixin (monitor.rb) as it uses the same lock/begin yield/ensure block.
=end

Actions

Copy link

#15 [ruby-core:36010]

Updated by briangug (Brian Gugliemetti) about 14 years ago

File monitor-synchronize-use-c-implementation.patch monitor-synchronize-use-c-implementation.patch added

Actions

Copy link

#16 [ruby-core:37430]

Updated by nahi (Hiroshi Nakamura) about 14 years ago

Target version changed from 1.9.2 to 1.9.3

Actions

Copy link

#17 [ruby-core:37502]

Updated by ko1 (Koichi Sasada) about 14 years ago

Target version changed from 1.9.3 to 2.0.0

Let's talk it after 1.9.3 release.

Actions

Copy link

#18 [ruby-core:38947]

Updated by jtara (Jon Tara) almost 14 years ago

Brian Gugliemetti wrote:

File monitor-synchronize-use-c-implementation.patch added

There is a parameter mis-match between the declaration and definition of rb_mon_synchronize. Compilation fails on at least OSX.

Actions

Copy link

#19 [ruby-core:44443]

Updated by MingVonsalis (Ming Vonsalis) about 13 years ago

jtara (Jon Tara) wrote:

Brian Gugliemetti wrote:

File monitor-synchronize-use-c-implementation.patch added

There is a parameter mis-match between the declaration and definition of rb_mon_synchronize. Compilation fails on at least OSX.

http://originsofinuitart.posterous.com/

Actions

Copy link

#20 [ruby-core:49432]

Updated by trapni (Christian Parpart) over 12 years ago

We just ran into this very ugly bug in production, too. We would really appreciate a fix in 1.9.4, a pure bugfix release of the 1.9 branch since this is a major bug. :)

Actions

Copy link

#21 [ruby-core:49435]

Updated by ko1 (Koichi Sasada) over 12 years ago

Assignee changed from ko1 (Koichi Sasada) to kosaki (Motohiro KOSAKI)

I passed this ticket to thread specialist.

Actions

Copy link

#22 [ruby-core:49473]

Updated by kosaki (Motohiro KOSAKI) over 12 years ago

Brian,

Can you please provide us a reproducer for you monitor.rb issue? And can you please make a new ticket for your issue to prevent cross line discussion? I plan to close this ticket soon and I also hope to don't forget your issue.

Thanks.

Actions

Copy link

#23

Updated by kosaki (Motohiro KOSAKI) over 12 years ago

Status changed from Assigned to Closed
% Done changed from 0 to 100

This issue was solved with changeset r37724.
Christopher, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

prelude.rb: Moved Mutex#synchronize to
thread.c (rb_mutex_synchronize_m): here. [Bug #4266]

Actions

Copy link

Also available in: Atom PDF

Like0

Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0

Project

General

Profile

Ruby

Tags

Custom queries

Bug #4266

Timeouts in threads cause "ThreadError: deadlock; recursive locking"

Updated by headius (Charles Nutter) over 14 years ago

Updated by headius (Charles Nutter) over 14 years ago

Updated by headius (Charles Nutter) over 14 years ago

Updated by headius (Charles Nutter) over 14 years ago

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

def synchronize
self.lock
// (1)
begin
yield
ensure
// (2)
self.unlock rescue nil
end
end¶

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

Updated by shyouhei (Shyouhei Urabe) over 14 years ago

Updated by normalperson (Eric Wong) about 14 years ago

Updated by briangug (Brian Gugliemetti) about 14 years ago

Updated by briangug (Brian Gugliemetti) about 14 years ago

Updated by nahi (Hiroshi Nakamura) about 14 years ago

Updated by ko1 (Koichi Sasada) about 14 years ago

Updated by jtara (Jon Tara) almost 14 years ago

Updated by MingVonsalis (Ming Vonsalis) about 13 years ago

Updated by trapni (Christian Parpart) over 12 years ago

Updated by ko1 (Koichi Sasada) over 12 years ago

Updated by kosaki (Motohiro KOSAKI) over 12 years ago

Updated by kosaki (Motohiro KOSAKI) over 12 years ago

Project

General

Profile

Ruby

Tags

Custom queries

Bug #4266

Timeouts in threads cause "ThreadError: deadlock; recursive locking"

Updated by headius (Charles Nutter) over 14 years ago

Updated by headius (Charles Nutter) over 14 years ago

Updated by headius (Charles Nutter) over 14 years ago

Updated by headius (Charles Nutter) over 14 years ago

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

def synchronize self.lock // (1) begin yield ensure // (2) self.unlock rescue nil end end¶

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

Updated by kosaki (Motohiro KOSAKI) over 14 years ago

Updated by shyouhei (Shyouhei Urabe) over 14 years ago

Updated by normalperson (Eric Wong) about 14 years ago

Updated by briangug (Brian Gugliemetti) about 14 years ago

Updated by briangug (Brian Gugliemetti) about 14 years ago

Updated by nahi (Hiroshi Nakamura) about 14 years ago

Updated by ko1 (Koichi Sasada) about 14 years ago

Updated by jtara (Jon Tara) almost 14 years ago

Updated by MingVonsalis (Ming Vonsalis) about 13 years ago

Updated by trapni (Christian Parpart) over 12 years ago

Updated by ko1 (Koichi Sasada) over 12 years ago

Updated by kosaki (Motohiro KOSAKI) over 12 years ago

Updated by kosaki (Motohiro KOSAKI) over 12 years ago

def synchronize
self.lock
// (1)
begin
yield
ensure
// (2)
self.unlock rescue nil
end
end¶