Project

General

Profile

Actions

Bug #15320

closed

IO.popen with MJIT worker thread may deadlock

Added by k0kubun (Takashi Kokubun) over 5 years ago. Updated over 5 years ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:89877]

Description

In r65785, we added mjit_pause on before_fork_ruby. We seem to have introduced a few deadlocks by that. This was discussed in https://bugs.ruby-lang.org/issues/15316, but I filed a separate ticket because this issue is independent from postponed_job race.

Example deadlock:

_pthread_mutex_unlock_usercnt (decr=1, mutex=mutex@entry=0x55a4ee3b7520 <mjit_engine_mutex>) at pthread_mutex_unlock.c:38
>>> Machine level backtrace

Thread 2 (Thread 0x7f9ea8981700 (LWP 20561)):
#0  0x00007f9ec2b4acf6 in __GI_ppoll (fds=fds@entry=0x7f9ea897fb08, nfds=nfds@entry=1, timeout=<optimized out>, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
#1  0x000055a4edfd0f37 in rb_sigwait_sleep (th=th@entry=0x0, sigwait_fd=sigwait_fd@entry=3, rel=rel@entry=0x0) at /home/ko1/ruby/src/trunk-mjit-wait/thread_pthread.c:1969
#2  0x000055a4edf3b0dc in ruby_waitpid_locked (vm=vm@entry=0x55a4ee90b950, pid=pid@entry=20590, status=status@entry=0x7f9ea897fbfc, options=options@entry=0, cond=cond@entry=0x7f9ea897fc00) at /home/ko1/ruby/src/trunk-mjit-wait/process.c:1070
#3  0x000055a4eded2e82 in exec_process (path=<optimized out>, argv=argv@entry=0x7f9ebc0061e0) at /home/ko1/ruby/src/trunk-mjit-wait/mjit_worker.c:611
#4  0x000055a4eded4528 in link_o_to_so (so_file=0x7f9ea897fc70 "/tmp/_ruby_mjit_p27024u127.so", o_files=0x7f9ea897fd20) at /home/ko1/ruby/src/trunk-mjit-wait/mjit_worker.c:824
#5  convert_unit_to_func (unit=unit@entry=0x55a4eeb2f790, cc_entries=<optimized out>, is_entries=<optimized out>) at /home/ko1/ruby/src/trunk-mjit-wait/mjit_worker.c:1090
#6  0x000055a4eded4e52 in mjit_worker () at /home/ko1/ruby/src/trunk-mjit-wait/mjit_worker.c:1217
#7  0x000055a4edfcb15a in mjit_worker (arg=0x55a4eded4a20 <mjit_worker>) at /home/ko1/ruby/src/trunk-mjit-wait/thread_pthread.c:1876
#8  0x00007f9ec3a916db in start_thread (arg=0x7f9ea8981700) at pthread_create.c:463
#9  0x00007f9ec2b5788f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7f9ec40df740 (LWP 27024)):
#0  __pthread_mutex_unlock_usercnt (decr=1, mutex=mutex@entry=0x55a4ee3b7520 <mjit_engine_mutex>) at pthread_mutex_unlock.c:38
#1  __GI___pthread_mutex_unlock (mutex=mutex@entry=0x55a4ee3b7520 <mjit_engine_mutex>) at pthread_mutex_unlock.c:345
#2  0x000055a4edfcf3e9 in rb_native_mutex_unlock (lock=lock@entry=0x55a4ee3b7520 <mjit_engine_mutex>) at /home/ko1/ruby/src/trunk-mjit-wait/thread_pthread.c:400
#3  0x000055a4eded6659 in CRITICAL_SECTION_FINISH (msg=0x55a4ee0d7226 "in stop_worker", level=3) at /home/ko1/ruby/src/trunk-mjit-wait/mjit_worker.c:412
#4  stop_worker () at /home/ko1/ruby/src/trunk-mjit-wait/mjit.c:689
#5  mjit_pause (wait_p=wait_p@entry=0) at /home/ko1/ruby/src/trunk-mjit-wait/mjit.c:719
#6  0x000055a4edf3e603 in before_fork_ruby () at /home/ko1/ruby/src/trunk-mjit-wait/process.c:1510
#7  rb_fork_ruby (status=status@entry=0x7ffde545cce8) at /home/ko1/ruby/src/trunk-mjit-wait/process.c:4011
#8  0x000055a4edeb1482 in pipe_open (execarg_obj=execarg_obj@entry=8, fmode=3, convconfig=convconfig@entry=0x7ffde545cdf0, modestr=<optimized out>) at /home/ko1/ruby/src/trunk-mjit-wait/io.c:6592
#9  0x000055a4edeb29e2 in rb_io_s_popen (argc=<optimized out>, argv=<optimized out>, klass=94166865734200) at /home/ko1/ruby/src/trunk-mjit-wait/io.c:6868
#10 0x000055a4ee00abc5 in vm_call_cfunc_with_frame (ci=0x55a4eef36ca0, cc=<optimized out>, calling=<optimized out>, reg_cfp=0x7f9ec40dd9b8, ec=0x55a4ee90c2e8) at /home/ko1/ruby/src/trunk-mjit-wait/vm_insnhelper.c:1908
#11 vm_call_cfunc (ec=0x55a4ee90c2e8, reg_cfp=0x7f9ec40dd9b8, calling=<optimized out>, ci=0x55a4eef36ca0, cc=<optimized out>) at /home/ko1/ruby/src/trunk-mjit-wait/vm_insnhelper.c:1924
#12 0x000055a4ee016033 in vm_call_method (ec=0x55a4ee90c2e8, cfp=0x7f9ec40dd9b8, calling=<optimized out>, ci=<optimized out>, cc=<optimized out>) at /home/ko1/ruby/src/trunk-mjit-wait/vm_insnhelper.c:2397
#13 0x000055a4ee01ce93 in vm_exec_core (ec=ec@entry=0x55a4ee90c2e8, initial=initial@entry=0) at ../../src/trunk-mjit-wait/insns.def:751
#14 0x000055a4ee01292c in rb_vm_exec (ec=0x55a4ee90c2e8, mjit_enable_p=1) at /home/ko1/ruby/src/trunk-mjit-wait/vm.c:1825
#15 0x00007f9ea8985491 in ?? ()
#16 0x000055a4ee90c2e8 in ?? ()
#17 0x00007ffde545d280 in ?? ()
#18 0x00007ffd00000000 in ?? ()
#19 0x0000000000000003 in ?? ()
#20 0x0000000000000000 in ?? ()
>>> Dump Ruby level backtrace
th: 0x55a4ee90be60, native_id: 0x7f9ec40df740
-- Control frame information -----------------------------------------------
c:0030 p:---- s:0190 e:000189 CFUNC  :popen
c:0029 p:0015 s:0184 e:000183 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/ruby/test_process.rb:1776
c:0028 p:0059 s:0177 e:000176 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit.rb:1181 [FINISH]
c:0027 p:0058 s:0171 e:000170 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/minitest/unit.rb:1291 [FINISH]
c:0026 p:0015 s:0162 e:000161 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit/testcase.rb:18
c:0025 p:0082 s:0157 e:000156 BLOCK  /home/ko1/ruby/src/trunk-mjit-wait/test/lib/minitest/unit.rb:952 [FINISH]
c:0024 p:---- s:0150 e:000149 CFUNC  :map
c:0023 p:0133 s:0146 e:000145 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/minitest/unit.rb:945
c:0022 p:0045 s:0134 e:000133 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit.rb:1044
c:0021 p:0011 s:0127 e:000125 BLOCK  /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit.rb:552 [FINISH]
c:0020 p:---- s:0121 e:000120 CFUNC  :each
c:0019 p:0057 s:0117 e:000116 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit.rb:550
c:0018 p:0009 s:0110 e:000109 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit.rb:587
c:0017 p:0460 s:0103 e:000101 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit.rb:512
c:0016 p:0048 s:0089 e:000088 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit.rb:548
c:0015 p:0009 s:0082 e:000081 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit.rb:587
c:0014 p:0155 s:0075 e:000074 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/minitest/unit.rb:888
c:0013 p:0018 s:0061 e:000060 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit.rb:981
c:0012 p:0006 s:0056 e:000055 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/minitest/unit.rb:1110
c:0011 p:0007 s:0052 e:000051 BLOCK  /home/ko1/ruby/src/trunk-mjit-wait/test/lib/minitest/unit.rb:1097 [FINISH]
c:0010 p:---- s:0048 e:000047 CFUNC  :each
c:0009 p:0058 s:0044 e:000043 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/minitest/unit.rb:1096
c:0008 p:0016 s:0039 e:000038 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/minitest/unit.rb:1084
c:0007 p:0009 s:0034 e:000033 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit.rb:790
c:0006 p:0009 s:0028 e:000027 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit.rb:620
c:0005 p:0017 s:0022 e:000021 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit.rb:33
c:0004 p:0037 s:0017 e:000016 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit.rb:1143
c:0003 p:0011 s:0013 e:000012 METHOD /home/ko1/ruby/src/trunk-mjit-wait/test/lib/test/unit.rb:1147
c:0002 p:0261 s:0008 E:001458 EVAL   /home/ko1/ruby/src/trunk-mjit-wait/test/runner.rb:33 [FINISH]
c:0001 p:0000 s:0003 E:001230 (none) [FINISH]

As the backtrace looks similar to https://bugs.ruby-lang.org/issues/14867#note-82, I suspect IO.popen has race with MJIT worker for waiting_pids handling and RUBY_VM_CHECK_INTS inside before_fork_ruby might unexpectedly proceed SIGCHLD handler (for gcc process) while waitpid list is not prepared for it yet.


Related issues 2 (0 open2 closed)

Related to Ruby master - Bug #15316: rb_postponed_job_register not thread-safeClosedk0kubun (Takashi Kokubun)Actions
Related to Ruby master - Bug #14867: Process.wait can wait for MJIT compiler processClosednormalperson (Eric Wong)Actions
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0