Feature #16476

Updated by kirs (Kir Shatrov) about 1 year ago

It seems that the blocking syscall made by `Socket.getaddrinfo` blocks the Ruby VM in such a way that `Timeout.timeout` has no effect. 
 See the reproduction steps in the attached getaddrinfo_interrupt.rb (https://gist.github.com/kirs/00c02ef92e0418578135fe0a6cbd3d7d). This affects all modern Ruby versions, including the latest, 2.7.0. 
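
 A minimal sketch of how the effect could be measured, separate from the attached script. `measure_timeout` is a hypothetical helper introduced here for illustration; it only reports how long a block wrapped in `Timeout.timeout` actually took to hand control back:

 ``` ruby
 require "timeout"

 # Hypothetical helper (not from the attached script): runs the block under
 # Timeout.timeout(limit) and returns [outcome, elapsed_seconds].
 def measure_timeout(limit)
   started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
   outcome =
     begin
       Timeout.timeout(limit) { yield }
       :completed
     rescue Timeout::Error
       :timed_out
     end
   elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
   [outcome, elapsed]
 end
 ```

 For an interruptible block such as `sleep`, the elapsed time stays close to `limit`. If the behaviour described above holds, `measure_timeout(1) { Socket.getaddrinfo("www.ruby-lang.org", "http") }` against a slow resolver would be expected to report an elapsed time well above 1 second.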

 Combined with the default 10s DNS resolver timeout on many Linux systems, this has a very noticeable effect on production Ruby apps: they are not resilient to slow DNS resolution and cannot fail fast even with `Timeout.timeout`. 

 While https://bugs.ruby-lang.org/issues/15553 improves the situation for `Addrinfo.getaddrinfo`, `Socket.getaddrinfo` still blocks the VM, and `Timeout.timeout` has no effect on it. 

 I'd like to discuss what could be done to make that call non-blocking for threads in the Ruby VM. 
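
 As one possible application-level stopgap (not a fix inside the VM), the lookup could be moved to a worker thread and bounded with `Thread#join`. This is only a sketch: `getaddrinfo_with_deadline` and its `resolver:` parameter are hypothetical names, and it assumes the blocked call still lets other Ruby threads run, which has not been verified here.

 ``` ruby
 require "socket"
 require "timeout"

 # Hypothetical helper, not part of Ruby's socket library.
 # `resolver` is injectable so the sketch can be exercised without real DNS.
 def getaddrinfo_with_deadline(host, service, seconds,
                               resolver: Socket.method(:getaddrinfo))
   worker = Thread.new do
     # Errors are re-raised in the caller by #join / #value below.
     Thread.current.report_on_exception = false
     resolver.call(host, service)
   end

   # Thread#join returns nil when the deadline passes before the thread ends.
   return worker.value if worker.join(seconds)

   # The worker is not interrupted; it is left to finish in the background.
   raise Timeout::Error, "name resolution for #{host} exceeded #{seconds}s"
 end
 ```

 The caller gets control back after `seconds`, but the underlying lookup keeps running until the resolver gives up, so this bounds latency rather than resource usage.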

 **UPD:** Looking closer, I can see that `Socket.getaddrinfo("www.ruby-lang.org", "http")` and `Addrinfo.getaddrinfo("www.ruby-lang.org", "http")` call the non-interruptible `getaddrinfo`, while `Addrinfo.getaddrinfo("www.ruby-lang.org", "http", timeout: 10)` calls `getaddrinfo_a`, which is interruptible: 

 ``` ruby 
 require "socket" 
 require "timeout" 

 # interrupts as expected 
 Timeout.timeout(1) do 
   Addrinfo.getaddrinfo("www.ruby-lang.org", "http", timeout: 10) 
 end 
 ``` 

 I'd suggest that we try to *always* use `getaddrinfo_a` when it's available, including in `Socket.getaddrinfo`. What downsides would that have? 
 I'd be happy to work on a patch.