Feature #16476

Socket.getaddrinfo cannot be interrupted by Timeout.timeout

Added by kirs (Kir Shatrov) over 1 year ago. Updated 4 months ago.

Status: Open
Priority: Normal
Target version: -
[ruby-core:96642]

Description

It seems that the blocking syscall made by Socket.getaddrinfo blocks the Ruby VM in such a way that Timeout.timeout has no effect.
See reproduction steps in getaddrinfo_interrupt.rb (https://gist.github.com/kirs/00c02ef92e0418578135fe0a6cbd3d7d). This affects all modern Ruby versions, including the latest 2.7.0.
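
The gist itself is not reproduced here. As a rough sketch of how the effect could be measured, the helper below (a hypothetical `measure_timeout`, not part of the gist or of Ruby) wraps a block in Timeout.timeout and reports how long the call really took. It uses `sleep` as a stand-in for an interruptible call, so it runs without a slow DNS resolver.

```ruby
require "timeout"

# Hypothetical helper: runs the block under Timeout.timeout and reports
# whether it timed out and how many seconds the call actually took.
def measure_timeout(limit)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  outcome =
    begin
      Timeout.timeout(limit) { yield }
      :completed
    rescue Timeout::Error
      :timed_out
    end
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [outcome, elapsed]
end

# An interruptible call (sleep) is cut off close to the limit.
outcome, elapsed = measure_timeout(0.2) { sleep 2 }
puts format("%s after %.2fs", outcome, elapsed)
```

Swapping the `sleep` for `Socket.getaddrinfo(host, service)` against a slow resolver should, per this report, show an elapsed time that follows the resolver rather than the limit.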

Combined with the default 10s resolver timeout on many Linux systems, this has a very noticeable effect on production Ruby apps: they are not resilient to slow DNS resolution and cannot fail fast even with Timeout.timeout.

While https://bugs.ruby-lang.org/issues/15553 improves the situation for Addrinfo.getaddrinfo, Socket.getaddrinfo is still blocking the VM and Timeout has no effect.

I'd like to discuss what could be done to make that call non-blocking for threads in the Ruby VM.

UPD: looking closer, I can see that Socket.getaddrinfo("www.ruby-lang.org", "http") and Addrinfo.getaddrinfo("www.ruby-lang.org", "http") call non-interruptible getaddrinfo, while Addrinfo.getaddrinfo("www.ruby-lang.org", "http", timeout: 10) calls getaddrinfo_a, which is interruptible:

require "socket"
require "timeout"

# interrupts as expected
Timeout.timeout(1) do
  Addrinfo.getaddrinfo("www.ruby-lang.org", "http", timeout: 10)
end

I'd suggest that we try to always use getaddrinfo_a when it's available, including in Socket.getaddrinfo. What downsides would that have?
I'd be happy to work on a patch.
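
Until something like that lands, one possible application-level workaround is to run the lookup in a worker thread and bound the wait with Thread#join. The following is only a minimal sketch: `resolve_with_deadline` and its `resolver:` parameter are hypothetical names, and it assumes the blocked lookup lets other Ruby threads keep running, which should be checked on the target Ruby version.

```ruby
require "socket"
require "timeout"

# Hypothetical helper (not part of Ruby's socket library): runs the resolver
# in a worker thread and waits at most `seconds` for it to finish.
# `resolver:` exists only so the lookup can be replaced in tests.
def resolve_with_deadline(host, service, seconds, resolver: Socket.method(:getaddrinfo))
  worker = Thread.new do
    # Errors are re-raised in the caller by join/value, so skip the log line.
    Thread.current.report_on_exception = false
    resolver.call(host, service)
  end

  # Thread#join returns nil when the limit elapses before the thread finishes.
  if worker.join(seconds)
    worker.value
  else
    # The worker is abandoned, not cancelled: the lookup keeps running in the
    # background until the system resolver gives up.
    raise Timeout::Error, "name resolution for #{host} exceeded #{seconds}s"
  end
end

# Usage (performs a real DNS lookup, so it is left commented out):
# resolve_with_deadline("www.ruby-lang.org", "http", 1)
```

The trade-off is that the caller fails fast but the abandoned thread stays alive until the lookup returns, so a burst of slow lookups can pile up threads.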


Related issues

Related to Ruby master - Feature #16381: Accept resolv_timeout in Net::HTTP (Open)
Related to Ruby master - Feature #17134: Add resolv_timeout to TCPSocket (Open)