Bug #21876 (closed)
Addrinfo.getaddrinfo(AF_UNSPEC) deadlocks after fork on macOS for IPv4-only hosts
Description
Summary
On macOS, Addrinfo.getaddrinfo(host, service, Socket::AF_UNSPEC, Socket::SOCK_STREAM) can deadlock in forked child processes when the host has no AAAA (IPv6) DNS records and the parent process previously resolved the same host.
I hit this when an HTTP library acquired an OAuth access token in a Rails initializer, the process then forked, and the forked process made a separate call to the same host.
Environment
- macOS (tested on arm64-darwin24 and arm64-darwin25, Apple Silicon)
- Ruby 3.4.7, 3.4.8
- The issue is probabilistic — frequency varies by environment but is highly reproducible under sustained DNS activity
Reproduction
Minimal example:
    require "socket"
    require "timeout"

    # Parent resolves an IPv4-only host (no AAAA records)
    Addrinfo.getaddrinfo("httpbin.org", "https", Socket::AF_UNSPEC, Socket::SOCK_STREAM)

    pid = fork do
      begin
        Timeout.timeout(5) do
          Addrinfo.getaddrinfo("httpbin.org", "https", Socket::AF_UNSPEC, Socket::SOCK_STREAM)
        end
        puts "Child: OK"
      rescue Timeout::Error
        puts "Child: DEADLOCK - getaddrinfo hung for 5s"
      end
    end

    Process.waitpid(pid)
A single invocation may or may not deadlock. The attached script runs 50 trials each for several variants to demonstrate the pattern. The deadlock may not appear on the first run, but across several runs you should see at least one deadlock in Test 2, and often every trial in Test 1 and Test 2 deadlocks.
See attachment - ruby_getaddrinfo_fork_bug.rb
Typical output:
Test 1 (single IPv4-only host): 20/20 deadlocked
Test 2 (multi-host warmup): 20/20 deadlocked
Test 3 (dual-stack host control): 0/20 deadlocked
Test 4 (AF_INET workaround): 0/20 deadlocked
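The attached script is not reproduced here. As a rough sketch of how such a trial loop could be structured, the following counts how many forked children fail to finish a lookup in time. The helper names child_lookup_hangs? and count_hangs, the 5-second limit, and the use of port 443 in place of the "https" service name are all assumptions made for illustration, not taken from the attachment.

```ruby
require "socket"

# Hypothetical helper (not from the attached script): fork a child that
# resolves `host`, and report whether it failed to exit within `limit` seconds.
def child_lookup_hangs?(host, family: Socket::AF_UNSPEC, limit: 5)
  pid = fork do
    # Port 443 stands in for the "https" service name used in the report.
    Addrinfo.getaddrinfo(host, 443, family, Socket::SOCK_STREAM)
    exit!(0)
  end

  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + limit
  loop do
    return false if Process.waitpid(pid, Process::WNOHANG) # child has exited
    break if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep 0.05
  end

  # Enforce the limit from the parent so a stuck child cannot hang the run.
  Process.kill("KILL", pid)
  Process.waitpid(pid)
  true
end

# Resolve once in the parent (the warmup), then count hung children.
def count_hangs(host, trials:, family: Socket::AF_UNSPEC, limit: 5)
  Addrinfo.getaddrinfo(host, 443, family, Socket::SOCK_STREAM)
  trials.times.count { child_lookup_hangs?(host, family: family, limit: limit) }
end
```

Calling count_hangs("httpbin.org", trials: 20) would correspond roughly to Test 1, and passing family: Socket::AF_INET to Test 4. The parent enforces the time limit instead of relying on Timeout inside the child, so a child that is stuck cannot block the run. A child that exits abnormally is counted as not hung in this sketch.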
Context
The deadlock occurs when ALL of these conditions hold:
- macOS (not observed on Linux)
- Parent called getaddrinfo(host, AF_UNSPEC) for a host with no AAAA (IPv6) records
- Child calls getaddrinfo for the same host with AF_UNSPEC
Not affected:
- Hosts with AAAA records (dual-stack), e.g., www.google.com, rubygems.org
- Using Socket::AF_INET instead of Socket::AF_UNSPEC
- Hosts the parent never resolved
| Host | AAAA records | Child deadlocks? |
|---|---|---|
| httpbin.org | None | Yes |
| www.github.com | None | Yes |
| api.github.com | None | Yes |
| stackoverflow.com | None | Yes |
| www.google.com | Yes | No |
| rubygems.org | Yes | No |
| example.com | Yes | No |
| www.cloudflare.com | Yes | No |
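Since Socket::AF_INET is listed as unaffected, one possible workaround is to request IPv4 only when running on macOS. This is a minimal sketch; the helper names lookup_family and resolve_stream are hypothetical, and the platform check via RUBY_PLATFORM is an assumption about how an application might choose to scope the workaround.

```ruby
require "socket"

# Hypothetical helper: use AF_INET on macOS (Darwin), AF_UNSPEC elsewhere.
def lookup_family
  RUBY_PLATFORM.include?("darwin") ? Socket::AF_INET : Socket::AF_UNSPEC
end

# Hypothetical helper: stream-socket lookup using the family chosen above.
def resolve_stream(host, port)
  Addrinfo.getaddrinfo(host, port, lookup_family, Socket::SOCK_STREAM)
end
```

The trade-off is that an IPv4-only lookup cannot reach hosts that publish only AAAA records, so this fits best where the target hosts are known to have IPv4 addresses.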
Potential Root Cause
As I understand it, on macOS, getaddrinfo communicates with the mDNSResponder system daemon via Mach IPC ports. When getaddrinfo(AF_UNSPEC) queries a host with no AAAA records, the negative AAAA result appears to be cached via Mach port state. After fork(), the child process inherits the address space (including references to this cached state) but does not inherit the Mach port connections to mDNSResponder. When the child calls getaddrinfo for the same host, it encounters the stale cache entry and deadlocks trying to communicate over the invalidated Mach port.
Hosts with positive AAAA results are not affected, presumably because their cache entries do not require re-contacting mDNSResponder in the same code path.
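If the explanation above is right, and given that hosts the parent never resolved are reported as unaffected, another possible mitigation is to keep pre-fork lookups out of the parent process by running them in a short-lived child. This is a sketch under that assumption and has not been verified against the bug. The helper name resolve_in_subprocess is hypothetical, and returning plain IP address strings is an illustrative choice.

```ruby
require "socket"

# Hypothetical helper: resolve `host` in a throwaway child process so the
# parent itself never calls getaddrinfo for it. Returns IP address strings,
# or an empty array if the child could not resolve the host.
def resolve_in_subprocess(host, port, family: Socket::AF_UNSPEC)
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    addrs = Addrinfo.getaddrinfo(host, port, family, Socket::SOCK_STREAM)
    writer.write(Marshal.dump(addrs.map(&:ip_address)))
    writer.close
    exit!(0)
  end

  writer.close
  payload = reader.read # reaches EOF once the child closes its end or exits
  reader.close
  Process.waitpid(pid)

  payload.empty? ? [] : Marshal.load(payload)
end
```

This only helps when the parent's code can be changed to use such a helper. A lookup made inside an HTTP library, as in the Rails initializer case, would need a different approach, such as deferring the request until after the fork.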
Feature #20590
Ruby 3.4's fork safety improvements (Feature #20590) added a read-write lock around getaddrinfo to prevent fork() while a getaddrinfo call is actively running. However, this does not address the issue reported here — the problem is not about forking during a getaddrinfo call, but about stale mDNSResponder Mach port state that is inherited by the child process after getaddrinfo has completed in the parent.
Updated by luke-gru (Luke Gruber) 7 days ago
· Edited
I'm getting a segfault when running your minimal reproduction script on my MacBook Pro (Darwin Mac 25.2.0 Darwin Kernel Version 25.2.0 (Apple Silicon)).
The segfault occurs when Ruby is compiled with any of the 3 GETADDRINFO_IMPL implementations it uses. This looks to be a bug in Darwin and not Ruby, although maybe we can work around it. Have you sent this bug report to Apple? If not, I'll try coming up with a reproduction in pure C that we can send to them.
Edit: I just noticed this is a similar report to https://bugs.ruby-lang.org/issues/21790, which was marked as closed (Third-party issue). The conclusion was that this is an Apple bug, and using AF_INET works as expected so please use that if you can.
Updated by luke-gru (Luke Gruber) 7 days ago
- Status changed from Open to Third Party's Issue