Bug #15763
closed
Segmentation fault in timeout.rb / sleep
Added by stan-envato (Stan Pitucha) over 5 years ago.
Updated about 5 years ago.
Status:
Third Party's Issue
ruby -v:
ruby 2.6.2p47 (2019-03-13 revision 67232) [x86_64-darwin18]
[ruby-core:92239]
Description
I'm running into crashes on both ruby 2.6.1 and 2.6.2 (2.5.x is all good).
I'm on OSX / mojave with ruby installed via rbenv / ruby-build. Confirmed on two different machines.
The crash happens through the parallel gem, but it happens even if the number of processes is reduced to 1.
Short summary:
-- Control frame information -----------------------------------------------
c:0003 p:---- s:0011 e:000010 CFUNC :sleep
c:0002 p:0025 s:0006 e:000005 BLOCK /Users/viraptor/.rbenv/versions/2.6.2/lib/ruby/2.6.0/timeout.rb:86 [FINISH]
c:0001 p:---- s:0003 e:000002 (none) [FINISH]
-- Ruby level backtrace information ----------------------------------------
/Users/viraptor/.rbenv/versions/2.6.2/lib/ruby/2.6.0/timeout.rb:86:in `block (2 levels) in timeout'
/Users/viraptor/.rbenv/versions/2.6.2/lib/ruby/2.6.0/timeout.rb:86:in `sleep'
The rest is in the logs.
Additionally, the issue does not seem to happen on every build. If I rebuild the same version of ruby, the issue may go away. (until another few rebuilds)
This might be the same issue as:
The common points are:
- macOS (darwin17 or 18)
- uses multiple threads
- segfault in getaddrinfo
I could be wrong, but I suspect a bug of macOS's getaddrinfo.
Can you show a short program that causes the segfault?
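No reproduction script appears in this thread. As a rough sketch of the kind of program that exercises the suspected path (several threads alive while getaddrinfo runs), something like the following could be tried. The helper name `stress_getaddrinfo`, the host, and the thread and iteration counts are all assumptions for illustration, and nothing here is known to trigger the crash.

```ruby
require "socket"
require "timeout"

# Hypothetical stress helper, not taken from the original report.
# Each worker resolves `host` inside Timeout.timeout. On Ruby 2.6,
# timeout.rb starts a watcher thread that sleeps, so the lookup runs
# while other threads are alive.
def stress_getaddrinfo(host: "localhost", threads: 4, iterations: 50)
  workers = Array.new(threads) do
    Thread.new do
      count = 0
      iterations.times do
        Timeout.timeout(5) do
          # Addrinfo.getaddrinfo calls the platform's getaddrinfo(3).
          Addrinfo.getaddrinfo(host, nil)
          count += 1
        end
      end
      count
    end
  end
  # Thread#value joins the thread and returns the block's result.
  workers.sum(&:value)
end

resolved = stress_getaddrinfo
puts "completed #{resolved} lookups without crashing"
```

If the interpreter segfaults partway through on an affected build, the C level backtrace should show whether getaddrinfo is on the stack.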
I came here after seeing the same segfault in timeout.rb / CFUNC :sleep on ruby 2.6.2 on MacOS with a Rails project running with Puma and 2 worker threads.
Installed 2.6.3 and now seeing the segfault coming from pg - but interestingly while opening a connection to the db:
-- C level backtrace information -------------------------------------------
/Users/agranov/.rvm/rubies/ruby-2.6.3/lib/libruby.2.6.dylib(rb_vm_bugreport+0x82) [0x10bd87182]
/Users/agranov/.rvm/rubies/ruby-2.6.3/lib/libruby.2.6.dylib(rb_bug_context+0x1d3) [0x10bbd31f3]
/Users/agranov/.rvm/rubies/ruby-2.6.3/lib/libruby.2.6.dylib(sigsegv+0x51) [0x10bceb591]
/usr/lib/system/libsystem_platform.dylib(_sigtramp+0x1d) [0x7fff5827db5d]
/usr/lib/system/libsystem_trace.dylib(_os_log_preferences_refresh+0x4c) [0x7fff582a090a]
/usr/lib/system/libsystem_trace.dylib(0x7fff582a113d) [0x7fff582a113d]
/usr/lib/system/libsystem_info.dylib(si_destination_compare_statistics+0x903) [0x7fff581b9843]
/usr/lib/system/libsystem_info.dylib(0x7fff581b81a5) [0x7fff581b81a5]
/usr/lib/system/libsystem_info.dylib(0x7fff581b7d3f) [0x7fff581b7d3f]
/usr/lib/system/libsystem_info.dylib(0x7fff581966df) [0x7fff581966df]
/usr/lib/system/libsystem_c.dylib(_isort+0xc1) [0x7fff58140e5b]
/usr/lib/system/libsystem_c.dylib(0x7fff58140d88) [0x7fff58140d88]
/usr/lib/system/libsystem_info.dylib(0x7fff5818df2d) [0x7fff5818df2d]
/usr/lib/system/libsystem_info.dylib(0x7fff5818c885) [0x7fff5818c885]
/usr/lib/system/libsystem_info.dylib(0x7fff5818bf77) [0x7fff5818bf77]
/usr/lib/system/libsystem_info.dylib(0x7fff5818be7d) [0x7fff5818be7d]
/usr/lib/libpq.5.dylib(connectDBStart+0x1d4) [0x7fff57094af2]
/usr/lib/libpq.5.dylib(PQconnectStart+0x3a) [0x7fff570941de]
/usr/lib/libpq.5.dylib(PQconnectdb+0xb) [0x7fff57094181]
Reducing the Puma workers to a single one, I've yet to see a segfault.
Nix that: single Puma worker makes no difference. Back to segfault in timeout.rb.
- Related to Bug #13646: Segmentation fault with postgresql_adapter in Rails added
I think mame is correct that this is related to Mac OS X getaddrinfo. We have at least 5 separate bug reports for very similar issues. All segmentation faults with similar addresses, all on Mac OS X and either definitely or probably inside getaddrinfo:
- #15763: 0x00000001081bfa52 (definitely in getaddrinfo, this issue)
- #15490: 0x000000010f7e1a3a (definitely in getaddrinfo, during ssh connection)
- #15639: 0x000000010e82ca3a (definitely in getaddrinfo, during postgresql connection)
- #15749: 0x000000010d9bda7c (definitely in getaddrinfo, during postgresql connection)
- #13646: 0x000000010abfaa3a (probably in getaddrinfo, during postgresql connection)
In most of these cases, getaddrinfo isn't even called directly by Ruby; it is called by C code (e.g. libpq). I'm not sure Third Party's Issue is the appropriate status for these issues, but I'm also not sure there is anything we can do to fix them.
A valid workaround until this is fixed in macOS, if you can get away without IPv6, is to have your web server (e.g. Puma) bind to an IPv4 address such as -b 127.0.0.1 or -b 0.0.0.0 upon boot, and then all is :rainbows:.
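To illustrate what binding to an explicit IPv4 address means, here is a minimal sketch using Ruby's standard socket library rather than Puma itself. It only shows that a listener bound to 127.0.0.1 ends up on an IPv4 address; whether that avoids the crash on a given machine is an assumption based on the comment above, not something this snippet verifies.

```ruby
require "socket"

# Bind to the IPv4 loopback address explicitly.
# Port 0 asks the OS to pick any free port.
server = TCPServer.new("127.0.0.1", 0)
addr = server.local_address

puts "listening on #{addr.ip_address}:#{addr.ip_port} (ipv4: #{addr.ipv4?})"

server.close
```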
- Status changed from Open to Third Party's Issue