Project

General

Profile

Actions

Backport #7134

closed

Signal handling bug in Mac OS X

Added by auastro (Andy Kitchen) over 11 years ago. Updated almost 2 years ago.


Description

On Mac OS X, running the attached program causes the exception below about 1/10 times it is run.

$ ruby hup.rb
hup.rb:2:in write': deadlock; recursive locking (ThreadError) from hup.rb:2:in puts'
from hup.rb:2:in puts' from hup.rb:2:in block in '
from hup.rb:6:in call' from hup.rb:6:in write'
from hup.rb:6:in puts' from hup.rb:6:in puts'
from hup.rb:6:in `'

The expected output is:

In Hup Handler
Finished...

or

Finished...
In Hup Handler

My ruby is compiled with clang:

$ clang --version
Apple clang version 4.1 (tags/Apple/clang-421.11.65) (based on LLVM 3.1svn)
Target: x86_64-apple-darwin12.2.0
Thread model: posix


Files

hup.rb (94 Bytes) hup.rb execution of this script causes strange behaviour auastro (Andy Kitchen), 10/10/2012 05:36 PM
mutex_notrap.patch (4.32 KB) mutex_notrap.patch kosaki (Motohiro KOSAKI), 11/06/2012 04:40 AM

Updated by gustavoanatoly (Gustavo Anatoly Fernandes Velasquez Solis) over 11 years ago

This error it happens when kill is called:

https://gist.github.com/3906937

Followed by:

https://gist.github.com/3906907

But I'm trying to understand the bug causes.

Updated by gustavoanatoly (Gustavo Anatoly Fernandes Velasquez Solis) over 11 years ago

Hi,

Could somenone please, explain if is possible use recursive lock (PTHREAD_MUTEX_RECURSIVE) attributes in pthread.h, to fix this bug?

Updated by mame (Yusuke Endoh) over 11 years ago

  • Status changed from Open to Assigned
  • Assignee set to kosaki (Motohiro KOSAKI)

Kosaki-san, do you know anything about this?

--
Yusuke Endoh

Updated by gustavoanatoly (Gustavo Anatoly Fernandes Velasquez Solis) over 11 years ago

Hi, Yusuke.

I was studying this bug and I did a little bit changes to test

git://gist.github.com/4017432.git

because self lock is not allowed on ruby, please correct me if I wrong.

Thanks.

2012/11/5 mame (Yusuke Endoh)

Issue #7134 has been updated by mame (Yusuke Endoh).

Status changed from Open to Assigned
Assignee set to kosaki (Motohiro KOSAKI)

Kosaki-san, do you know anything about this?

--
Yusuke Endoh

Bug #7134: Signal handling bug in Mac OS X
https://bugs.ruby-lang.org/issues/7134#change-32417

Author: auastro (Andy Kitchen)
Status: Assigned
Priority: Normal
Assignee: kosaki (Motohiro KOSAKI)
Category:
Target version: 1.9.3
ruby -v: ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-darwin12]

On Mac OS X, running the attached program causes the exception below about
1/10 times it is run.

$ ruby hup.rb
hup.rb:2:in write': deadlock; recursive locking (ThreadError) from hup.rb:2:in puts'
from hup.rb:2:in puts' from hup.rb:2:in block in '
from hup.rb:6:in call' from hup.rb:6:in write'
from hup.rb:6:in puts' from hup.rb:6:in puts'
from hup.rb:6:in `'

The expected output is:

In Hup Handler
Finished...

or

Finished...
In Hup Handler

My ruby is compiled with clang:

$ clang --version
Apple clang version 4.1 (tags/Apple/clang-421.11.65) (based on LLVM 3.1svn)
Target: x86_64-apple-darwin12.2.0
Thread model: posix

--
http://bugs.ruby-lang.org/

--
» Gustavo Anatoly Fernandes Velásquez Solís
» Information Systems - Bachelor
» Gmail:
» Skype: gustavoanatoly
» Cell Phone: +55 (83) 9918-1763

Updated by gustavoanatoly (Gustavo Anatoly Fernandes Velasquez Solis) over 11 years ago

Sorry, I sent wrong gist link link: https://gist.github.com/4017432

2012/11/5 Gustavo Anatoly

Hi, Yusuke.

I was studying this bug and I did a little bit changes to test

git://gist.github.com/4017432.git

because self lock is not allowed on ruby, please correct me if I wrong.

Thanks.

2012/11/5 mame (Yusuke Endoh)

Issue #7134 has been updated by mame (Yusuke Endoh).

Status changed from Open to Assigned
Assignee set to kosaki (Motohiro KOSAKI)

Kosaki-san, do you know anything about this?

--
Yusuke Endoh

Bug #7134: Signal handling bug in Mac OS X
https://bugs.ruby-lang.org/issues/7134#change-32417

Author: auastro (Andy Kitchen)
Status: Assigned
Priority: Normal
Assignee: kosaki (Motohiro KOSAKI)
Category:
Target version: 1.9.3
ruby -v: ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-darwin12]

On Mac OS X, running the attached program causes the exception below
about 1/10 times it is run.

$ ruby hup.rb
hup.rb:2:in write': deadlock; recursive locking (ThreadError) from hup.rb:2:in puts'
from hup.rb:2:in puts' from hup.rb:2:in block in '
from hup.rb:6:in call' from hup.rb:6:in write'
from hup.rb:6:in puts' from hup.rb:6:in puts'
from hup.rb:6:in `'

The expected output is:

In Hup Handler
Finished...

or

Finished...
In Hup Handler

My ruby is compiled with clang:

$ clang --version
Apple clang version 4.1 (tags/Apple/clang-421.11.65) (based on LLVM
3.1svn)
Target: x86_64-apple-darwin12.2.0
Thread model: posix

--
http://bugs.ruby-lang.org/

--
» Gustavo Anatoly Fernandes Velásquez Solís
» Information Systems - Bachelor
» Gmail:
» Skype: gustavoanatoly
» Cell Phone: +55 (83) 9918-1763

--
» Gustavo Anatoly Fernandes Velásquez Solís
» Information Systems - Bachelor
» Gmail:
» Skype: gustavoanatoly
» Cell Phone: +55 (83) 9918-1763

Updated by kosaki (Motohiro KOSAKI) over 11 years ago

  • Assignee changed from kosaki (Motohiro KOSAKI) to nobu (Nobuyoshi Nakada)
  • Target version changed from 1.9.3 to 2.6

Hi,

We are sorry, recursive lock is clearly unacceptable because recursive lock was not designed for trap handler. It lead to memory corruption.
When this issue was occur, C stack trace is below.

4 ruby-- 0x0000000100260061 rb_mutex_lock + 1217 (thread_pthread.c:197)
5 ruby-- 0x00000001002605c8 rb_mutex_synchronize + 24 (thread.c:4286)
6 ruby-- 0x000000010015e4f0 io_binwrite + 512 (io.c:1163)
7 ruby-- 0x000000010016156b io_write + 955 (io.c:1261)
8 ruby-- 0x0000000100254947 vm_call0_body + 743 (vm_eval.c:110)
9 ruby-- 0x000000010024a6c8 rb_funcall + 536 (vm_eval.c:49)
10 ruby-- 0x0000000100160eff rb_io_puts + 159 (io.c:6717)
11 ruby-- 0x0000000100254947 vm_call0_body + 743 (vm_eval.c:110)
12 ruby-- 0x0000000100249f20 rb_funcall2 + 256 (vm_eval.c:750)
13 ruby-- 0x0000000100256c55 vm_call_cfunc + 421 (vm_insnhelper.c:1442)
14 ruby-- 0x000000010025668f vm_call_method + 287 (vm_insnhelper.c:1689)
15 ruby-- 0x0000000100242f8c vm_exec_core + 10060 (insns.def:968)
16 ruby-- 0x000000010024f024 vm_exec + 100 (vm.c:1159)
17 ruby-- 0x000000010024e781 vm_invoke_proc + 161 (vm.c:680)
18 ruby-- 0x000000010024e6d0 rb_vm_invoke_proc + 32 (vm.c:696)
19 ruby-- 0x0000000100145695 proc_call + 117 (proc.c:567)
20 ruby-- 0x0000000100254947 vm_call0_body + 743 (vm_eval.c:110)
21 ruby-- 0x0000000100249f20 rb_funcall2 + 256 (vm_eval.c:750)
22 ruby-- 0x000000010024d486 rb_eval_cmd + 262 (vm_eval.c:1379)
23 ruby-- 0x000000010025c35a rb_threadptr_execute_interrupts_common + 330 (thread.c:1716)
24 ruby-- 0x000000010025cd44 rb_thread_io_blocking_region + 164 (thread_pthread.c:196)
25 ruby-- 0x000000010016fa3c io_binwrite_string + 60 (io.c:1116)
26 ruby-- 0x000000010013fd0d rb_ensure + 109 (eval.c:802)
27 ruby-- 0x000000010015e4f0 io_binwrite + 512 (io.c:1163)
28 ruby-- 0x000000010016156b io_write + 955 (io.c:1261)
29 ruby-- 0x0000000100254947 vm_call0_body + 743 (vm_eval.c:110)
30 ruby-- 0x000000010024a6c8 rb_funcall + 536 (vm_eval.c:49)
31 ruby-- 0x0000000100160eff rb_io_puts + 159 (io.c:6717)
32 ruby-- 0x0000000100254947 vm_call0_body + 743 (vm_eval.c:110)
33 ruby-- 0x0000000100249f20 rb_funcall2 + 256 (vm_eval.c:750)
34 ruby-- 0x0000000100256c55 vm_call_cfunc + 421 (vm_insnhelper.c:1442)
35 ruby-- 0x000000010025668f vm_call_method + 287 (vm_insnhelper.c:1689)
36 ruby-- 0x0000000100242f8c vm_exec_core + 10060 (insns.def:968)
37 ruby-- 0x000000010024f024 vm_exec + 100 (vm.c:1159)
38 ruby-- 0x000000010024f965 rb_iseq_eval_main + 405 (vm.c:1402)
39 ruby-- 0x000000010013efdf ruby_exec_internal + 111 (eval.c:251)
40 ruby-- 0x000000010013ef27 ruby_run_node + 71 (eval.c:303)
41 ruby-- 0x000000010010b17f main + 79 (main.c:36)
42 libdyld.dylib 0x00007fff8997e7e1 start + 1

Now, almost all IO hold a mutex durling io for serialization since r20144 (aka 1.9.2p0). Then, you can't
use any IOs in trap handler.

Ho hum. I think this spec is too user unfriendly and NEWS-1.9.2 seems not explain this incompatibility.
Nobu, Can you please explain current spec, your intention and your recommended trap handler writing style.

Updated by kosaki (Motohiro KOSAKI) over 11 years ago

Note: current documentation of Process.kill() uses puts() in trap handler. So, I personally don't think
that IO in trap handler is invalid code.

/*

  • call-seq:
  • Process.kill(signal, pid, ...)    -> fixnum
    
  • Sends the given signal to the specified process id(s) if pid is positive.
  • If pid is zero signal is sent to all processes whose group ID is equal
  • to the group ID of the process. signal may be an integer signal number or
  • a POSIX signal name (either with or without a +SIG+ prefix). If signal is
  • negative (or starts with a minus sign), kills process groups instead of
  • processes. Not all signals are available on all platforms.
  • pid = fork do
    
  •    Signal.trap("HUP") { puts "Ouch!"; exit }
    
  •    # ... do some work ...
    
  • end
    
  • # ...
    
  • Process.kill("HUP", pid)
    
  • Process.wait
    

Updated by kosaki (Motohiro KOSAKI) over 11 years ago

  • Target version changed from 2.6 to 2.0.0

sorry, fixed unintentional target version change.

Updated by gustavoanatoly (Gustavo Anatoly Fernandes Velasquez Solis) over 11 years ago

Hi, Kosaki.

Firstly, thanks for explain the reasons, and really that memory error was
happening in my tests. I'm interested to understand how to will fix it by
the team, I want learn a little bit more.

Thanks.

2012/11/5 kosaki (Motohiro KOSAKI)

Issue #7134 has been updated by kosaki (Motohiro KOSAKI).

Target version changed from next minor to 2.0.0

sorry, fixed unintentional target version change.


Bug #7134: Signal handling bug in Mac OS X
https://bugs.ruby-lang.org/issues/7134#change-32453

Author: auastro (Andy Kitchen)
Status: Assigned
Priority: Normal
Assignee: nobu (Nobuyoshi Nakada)
Category:
Target version: 2.0.0
ruby -v: ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-darwin12]

On Mac OS X, running the attached program causes the exception below about
1/10 times it is run.

$ ruby hup.rb
hup.rb:2:in write': deadlock; recursive locking (ThreadError) from hup.rb:2:in puts'
from hup.rb:2:in puts' from hup.rb:2:in block in '
from hup.rb:6:in call' from hup.rb:6:in write'
from hup.rb:6:in puts' from hup.rb:6:in puts'
from hup.rb:6:in `'

The expected output is:

In Hup Handler
Finished...

or

Finished...
In Hup Handler

My ruby is compiled with clang:

$ clang --version
Apple clang version 4.1 (tags/Apple/clang-421.11.65) (based on LLVM 3.1svn)
Target: x86_64-apple-darwin12.2.0
Thread model: posix

--
http://bugs.ruby-lang.org/

--
» Gustavo Anatoly Fernandes Velásquez Solís
» Information Systems - Bachelor
» Gmail:
» Skype: gustavoanatoly
» Cell Phone: +55 (83) 9918-1763

Updated by kosaki (Motohiro KOSAKI) over 11 years ago

Attached my fixing proposal. This patch masks trap handler invocation during IO lock held.
But, I guess ko1 might dislike this approach because it slow down CHECK_INTS performance....

Updated by kosaki (Motohiro KOSAKI) over 11 years ago

  • Assignee changed from nobu (Nobuyoshi Nakada) to ko1 (Koichi Sasada)

ko1 suggested to use rb_thread_call_without_gvl2(). but it doesn't have a way to avoid CHECK_INTS completely.
I'm waiting ko1's rb_thread_call_without_gvl2 fix.

Updated by ko1 (Koichi Sasada) over 11 years ago

  • Assignee changed from ko1 (Koichi Sasada) to kosaki (Motohiro KOSAKI)

I fix gvl2!

Actions #13

Updated by kosaki (Motohiro KOSAKI) over 11 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r38398.
Andy, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • io.c (internal_write_func2): new helper function for rb_write_internal2().
  • io.c (rb_write_internal2): new function. it uses
    rb_thread_call_without_gvl2() instaed of rb_thread_io_blocking_region().
  • io.c (rb_binwrite_string): uses rb_write_internal2 instead of
    rb_write_internal. [Bug #7134]

Updated by kosaki (Motohiro KOSAKI) over 11 years ago

Fixed this by r38395 - r38401 series patches.

Actions #15

Updated by kosaki (Motohiro KOSAKI) over 11 years ago

  • Tracker changed from Bug to Backport
  • Project changed from Ruby master to Backport193
  • Status changed from Closed to Assigned
  • Assignee changed from kosaki (Motohiro KOSAKI) to usa (Usaku NAKAMURA)
  • Target version deleted (2.0.0)

Updated by usa (Usaku NAKAMURA) over 11 years ago

Is this list right?
r38395, r38396, r38397, r38398, r38399, r38400, r38401, r38402, r38422, r38447, r38448 and r38449
Are the fixes really completed?

And another problem.
I've not backported rb_thread_call_without_gvl2() from trunk yet.
I doubt the API is not stable.
Kosaki-san, do you think the orignal problem is very serious and necessary to backport many many changes from trunk?

Updated by usa (Usaku NAKAMURA) over 11 years ago

  • Status changed from Assigned to Feedback
  • Assignee changed from usa (Usaku NAKAMURA) to kosaki (Motohiro KOSAKI)

Updated by kosaki (Motohiro KOSAKI) over 11 years ago

  • Status changed from Feedback to Closed

Hmmm...
Ok, ok. I give up this claim. I hope nobody uses trap.

Updated by ioquatix (Samuel Williams) almost 2 years ago

Even without the change/fix proposed here, I cannot reproduce the original issue. Therefore, I assume something else has fixed this problem within the VM.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0