Feature #16381
Updated by kirs (Kir Shatrov) about 5 years ago
This is a follow-up to https://bugs.ruby-lang.org/issues/15553 and a successor of https://github.com/ruby/ruby/pull/1806 (credit to Carl Hörberg).

Unlike https://github.com/ruby/ruby/pull/1806, this patch introduces a separate `resolv_timeout` that Net::HTTP would pass to `Socket.tcp`. The idea of having it as a separate value (rather than reusing open_timeout) was suggested by Alan Wu. A separate value helps in the case where the user specifies open_timeout: 1, DNS resolution takes 0.9s and opening the TCP connection takes 0.9s: the total wait time is then 1.8s even though the allowed timeout was 1s.

This patch not only makes the DNS timeout customizable, but also fixes a bug where wrapping `TCPSocket.open` in `Timeout.timeout` with any number of seconds would still take 10 seconds, because the resolv operation is blocking on many systems (here's a gist to reproduce on Linux: https://gist.github.com/kirs/5f711099b23ddae7a87ebb082ce43f59).

This problem is not hypothetical; it's something we've been seeing in production fairly often: even with open/read timeouts on Net::HTTP as low as a second, the Ruby process would still be blocked for 10s (the system's resolv timeout) in case of DNS issues. On web servers with blocking IO (e.g. Unicorn), this causes a loss of capacity.
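
To make the two-phase timing concrete, here is a minimal sketch, not the patch itself. It assumes a Ruby version whose `Socket.tcp` accepts the `resolv_timeout:` keyword (the one discussed in the linked issue 15553) alongside `connect_timeout:`. The helper names `worst_case_wait` and `open_with_split_timeouts` are hypothetical and exist only for illustration.

```ruby
require "socket"

# Hypothetical helper: total wait when the same timeout value is applied
# independently to the DNS phase and to the TCP connect phase.
# Each phase is capped at `timeout`, but the two caps add up.
def worst_case_wait(resolv_time, connect_time, timeout)
  [resolv_time, timeout].min + [connect_time, timeout].min
end

# Hypothetical helper: open a TCP connection with a separate budget for
# name resolution and for connecting, using Socket.tcp keyword arguments.
def open_with_split_timeouts(host, port, resolv_timeout:, connect_timeout:)
  Socket.tcp(host, port,
             resolv_timeout: resolv_timeout,
             connect_timeout: connect_timeout)
end
```

With the numbers from the description, `worst_case_wait(0.9, 0.9, 1.0)` adds up to 1.8 seconds, which is why a single reused value does not bound the total. A caller who wants the whole operation to stay near one second could split the budget, for example `open_with_split_timeouts(host, port, resolv_timeout: 0.5, connect_timeout: 0.5)`; whether the resolver actually honors the timeout may depend on the platform.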