Project

General

Profile

Bug #15383

Reproducible crash: crash.rb:6: [BUG] unexpected THREAD_KILLED

Added by mbjs (Markus Schirp) almost 2 years ago. Updated over 1 year ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-linux]
[ruby-core:90312]

Description

Hi,

I'm reporting a reliable crash of the ruby interpreter on contested mutexeses that are accessed in child processes.

I currently think that this happens as the child processes main thread, may be waiting for a parent process sibling thread that was holding the mutex at the time of the fork. After the fork is done, all sibling threads are dead, and the mutex detects the attempt to wait for a dead thread, bailing out.

This is simular, but not identical to the case here: https://bugs.ruby-lang.org/issues/14578

Here is a gist with some more test results on various platforms: https://gist.github.com/mbj/e6795ee5e0583c5541ee250e9942279a

I'm fine to get my hands dirty, but would need some pointers if my above conclusion points to the right direction.

Best,

Markus


Files

output.txt (21.3 KB) output.txt Crash log mbjs (Markus Schirp), 12/05/2018 01:33 PM
crash.rb (200 Bytes) crash.rb Reproduction script (super small) mbjs (Markus Schirp), 12/05/2018 01:33 PM

Related issues

Related to Ruby master - Bug #14634: Queue#push seems to crash after forkClosedActions
#1

Updated by normalperson (Eric Wong) almost 2 years ago

  • Backport changed from 2.4: UNKNOWN, 2.5: UNKNOWN to 2.4: UNKNOWN, 2.5: REQUIRED
#2

Updated by normalperson (Eric Wong) almost 2 years ago

  • Status changed from Open to Closed

Applied in changeset trunk|r66230.


thread_sync.c (mutex_ptr): handle mutexes held by parent threads in children

Mutexes may be held by threads which only exist in the parent
process, so their waitqueues may be populated with references
to other dead threads. We must reset them at fork.

I am a moron for introducing this bug :<

[ruby-core:90312] [Bug #15383]

Updated by normalperson (Eric Wong) almost 2 years ago

https://bugs.ruby-lang.org/issues/15383

Thanks, it affects trunk; just more difficult to reproduce
because of thread cache.

I'm a moron for not noticing this when I fixed other bugs :<

r66230 should fix it in trunk and should be backported
(but r66229 is independently broken and I just reverted it for now)

#4

Updated by mbjs (Markus Schirp) almost 2 years ago

  • Subject changed from Reproducible crash: crash.sh:6: [BUG] unexpected THREAD_KILLED to Reproducible crash: crash.rb:6: [BUG] unexpected THREAD_KILLED

Updated by mbjs (Markus Schirp) almost 2 years ago

Thanks for the quick fix. Also for marking the fix to be backported.

Just curious, is there an associated CI build for these changes?

Updated by normalperson (Eric Wong) almost 2 years ago

mbj@schirp-dso.com wrote:

Just curious, is there an associated CI build for these changes?

I check https://rubyci.org/ and http://ci.rvm.jp/ (and get
automated mails from the latter).

There's also TravisCI; but I don't use JavaScript; so I rely
on others giving me URLs to the raw logs.

#7

Updated by nagachika (Tomoyuki Chikanaga) almost 2 years ago

  • Related to Bug #14634: Queue#push seems to crash after fork added

Updated by normalperson (Eric Wong) over 1 year ago

https://bugs.ruby-lang.org/issues/15383

r66230 should fix it in trunk and should be backported

No, actually. r66230 hides an existing problem in the fix
for https://bugs.ruby-lang.org/issues/14578
...
Still working on this and my head hurts :<

Also available in: Atom PDF