Bug #15490

socket.rb - recurring segmentation faults

Added by matthew.oriordan (Matthew O'Riordan) 4 months ago. Updated about 1 month ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 2.6.0p0 (2018-12-25 revision 66547) [x86_64-darwin18]
[ruby-core:90833]

Description

With Ruby 2.5.3p105, and now with Ruby 2.6.0 after our recent upgrade, we are still seeing fairly frequent segmentation faults from Ruby, specifically within socket.rb.

Looking in socket.rb, it seems to be related to the address lookup:

Addrinfo.getaddrinfo(nodename, service, family, socktype, protocol, flags).each(&block)

The full segfault report is below, and diagnostic reports are attached. I am happy to help reproduce this if I can, but I have never been able to reproduce it reliably, even though it happens once every few days.
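For anyone who wants to exercise the same lookup path, here is a minimal sketch that calls Addrinfo.getaddrinfo in a loop. The helper name resolve_repeatedly and the iteration count are hypothetical (they are not part of socket.rb), and nothing in this report shows that a loop like this actually triggers the crash.

```ruby
require "socket"

# Hypothetical helper (not part of socket.rb): performs the same
# Addrinfo.getaddrinfo lookup repeatedly and returns the total number
# of address entries seen across all iterations.
def resolve_repeatedly(host, port, iterations: 1_000)
  total = 0
  iterations.times do
    # family is left as nil; protocol and flags use their defaults
    Addrinfo.getaddrinfo(host, port, nil, :STREAM).each { total += 1 }
  end
  total
end
```

On an affected machine this could be run as resolve_repeatedly("localhost", 80) while the rest of the application is under load, to see whether the crash becomes more frequent.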


Files

ruby_2018-12-31-032126-2_MacBook-Pro.crash (46.8 KB), matthew.oriordan (Matthew O'Riordan), 12/31/2018 03:46 AM
ruby_2018-12-31-032126-3_MacBook-Pro.crash (46.8 KB), matthew.oriordan (Matthew O'Riordan), 12/31/2018 03:46 AM
ruby_2018-12-31-032126-1_MacBook-Pro.crash (46.8 KB), matthew.oriordan (Matthew O'Riordan), 12/31/2018 03:46 AM
ruby_2018-12-31-032125_MacBook-Pro.crash (46.8 KB), matthew.oriordan (Matthew O'Riordan), 12/31/2018 03:46 AM
bug-15490.log (833 KB), nobu (Nobuyoshi Nakada), 12/31/2018 08:47 AM

Related issues

Has duplicate Ruby trunk - Bug #15639: [BUG] Segmentation fault at 0x000000010e82ca3a (Open)

History

Updated by nobu (Nobuyoshi Nakada) 4 months ago

It always happens here. I couldn't find the source of si_destination_compare, but it may be a problem in libsystem_info.dylib.

7   ???                             0x00007fc6cddeaac0 0 + 140491834174144
8   libsystem_trace.dylib           0x00007fff6e31adb4 os_log_type_enabled + 627
9   libsystem_info.dylib            0x00007fff6e23305b si_destination_compare_statistics + 1659
10  libsystem_info.dylib            0x00007fff6e231bf3 si_destination_compare_internal + 707
11  libsystem_info.dylib            0x00007fff6e231762 si_destination_compare + 530
12  libsystem_info.dylib            0x00007fff6e20f95f _gai_addr_sort + 111
13  libsystem_c.dylib               0x00007fff6e1b9a0f _isort + 193
14  libsystem_c.dylib               0x00007fff6e1b993c _qsort + 2159
15  libsystem_info.dylib            0x00007fff6e207135 _gai_sort_list + 789
16  libsystem_info.dylib            0x00007fff6e205b88 si_addrinfo + 2040
17  libsystem_info.dylib            0x00007fff6e205262 _getaddrinfo_internal + 242
18  libsystem_info.dylib            0x00007fff6e20515d getaddrinfo + 61

Updated by matthew.oriordan (Matthew O'Riordan) 4 months ago

Is there anything I can do to help track down the source of si_destination_compare, or the problem you believe is in libsystem_info.dylib?

Updated by jessebs (Jesse Bowes) 3 months ago

I have run into a similar issue using Ruby 2.5.1 but unfortunately don't have an easy way to reproduce.

A couple of things that help mitigate it (and may be useful for finding the actual issue):

getaddrinfo is in the backtrace, and for me this happens around some network code. I found that using an IP address instead of a hostname makes the issue go away.

Another option I have found: turning off garbage collection (GC.disable) around the problematic code also makes it go away.
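A rough sketch of how the two mitigations might look together. The without_gc helper is a hypothetical name of my own, and 127.0.0.1 with port 80 is only a placeholder for whatever address the real code connects to. This is a workaround, not a fix.

```ruby
require "socket"

# Hypothetical helper: runs the block with GC disabled and restores the
# previous GC state afterwards, even if the block raises.
def without_gc
  was_disabled = GC.disable # true if GC was already disabled before this call
  yield
ensure
  GC.enable unless was_disabled
end

# Mitigation 1: pass an IP literal instead of a hostname.
# Mitigation 2: keep GC off while the lookup runs.
addrs = without_gc do
  Addrinfo.getaddrinfo("127.0.0.1", 80, nil, :STREAM)
end
```

Keeping the GC-disabled region as small as possible limits the memory growth that comes with turning the collector off.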

Updated by nobu (Nobuyoshi Nakada) about 2 months ago

  • Has duplicate Bug #15639: [BUG] Segmentation fault at 0x000000010e82ca3a added

Updated by zormandi (Zoltan Ormandi) about 1 month ago

We're seeing this issue as well, on Ruby 2.6.1. For us, it occurs towards the end of a fairly large test suite, while running one of our legacy Cucumber tests. When we run only the Cucumber section of our test suite (not the whole thing), the issue does not occur. It also does not happen on our CI server, which makes me suspect that this might be an OSX-exclusive problem: we're only seeing it on our MacBooks.

The test that triggers the crash starts up a fake web server using WEBrick to simulate one of our services. It binds to 'http://localhost:42638' but the suggestion of using an IP address instead of a hostname didn't solve the problem for us; it still occurs if we change the binding to 'http://127.0.0.1:42638'.

Let me know if there's any information that could help (other than a reproduction script, which I obviously cannot provide). It would be great to get rid of this bug.

UPDATE

Unfortunately, I was wrong. The issue does sometimes occur even when only the Cucumber section of our test suite is executed. Turning off the GC didn't help either.

Updated by PikachuEXE (Pikachu Leung) about 1 month ago

I might have hit a similar issue with 2.6.2 (it also crashes at os_log_type_enabled + 627):
https://bugs.ruby-lang.org/issues/15623#note-2

See update #2
