The timeout option for Addrinfo.getaddrinfo is not reliable on Ruby 2.7.2

Added by smcgivern (Sean McGivern) about 2 months ago. Updated 18 days ago.

#15553 introduced a timeout option for Addrinfo.getaddrinfo, which uses getaddrinfo_a internally. It appears this has since been reverted in the development branch (possibly due to #17220; I didn't quite follow the discussion).

However, the option is still unreliable on Ruby 2.7.2, even without forking:

$ ruby -v
ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-linux]
$ ruby -e "require 'resolv'; 10000.times { |i| p [i, Addrinfo.getaddrinfo('2130706433', 80, nil, :STREAM, timeout: 5)] }" | tail
Traceback (most recent call last):
    3: from -e:1:in `<main>'
    2: from -e:1:in `times'
    1: from -e:1:in `block in <main>'
-e:1:in `getaddrinfo': getaddrinfo_a: All requests done (SocketError)
[1473, [#<Addrinfo: TCP (2130706433)>]]
[1474, [#<Addrinfo: TCP (2130706433)>]]
[1475, [#<Addrinfo: TCP (2130706433)>]]
[1476, [#<Addrinfo: TCP (2130706433)>]]
[1477, [#<Addrinfo: TCP (2130706433)>]]
[1478, [#<Addrinfo: TCP (2130706433)>]]
[1479, [#<Addrinfo: TCP (2130706433)>]]
[1480, [#<Addrinfo: TCP (2130706433)>]]
[1481, [#<Addrinfo: TCP (2130706433)>]]
[1482, [#<Addrinfo: TCP (2130706433)>]]

This is on a VirtualBox VM and fails fairly quickly. On a 'real' Linux system, I need to try a few times or bump the number of iterations, but it also fails consistently with consecutive requests. I'm choosing 2130706433 (the decimal representation of 127.0.0.1) as that's what our test suite uses, and that's what failed when I tried to use the timeout option.
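
For anyone unfamiliar with the integer form: 2130706433 is the 32-bit decimal value of the IPv4 loopback address. A minimal sketch of the conversion using the standard ipaddr library follows; the helper names decimal_to_dotted and dotted_to_decimal are made up for illustration and are not part of the reproduction above.

```ruby
require 'ipaddr'
require 'socket'

# Hypothetical helper: turn a 32-bit integer into dotted-quad notation.
# An Integer argument needs an explicit address family.
def decimal_to_dotted(n)
  IPAddr.new(n, Socket::AF_INET).to_s
end

# Hypothetical helper: turn dotted-quad notation back into an integer.
def dotted_to_decimal(s)
  IPAddr.new(s).to_i
end

puts decimal_to_dotted(2130706433)
puts dotted_to_decimal('127.0.0.1')
```

So the reproduction is resolving the loopback address, which should not need any network round trip.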

On Ruby 3.0.0-dev this does not fail due to the aforementioned revert, but should this also be removed from Ruby 2.7 until it's ready?

Updated by stanhu (Stan Hu) 18 days ago

I agree that this should be removed from Ruby 2.7. We were able to trigger a segfault by doing this:

require 'socket'

loop do
  Addrinfo.getaddrinfo('localhost', nil, timeout: 0)
rescue SocketError
  # Ignore resolver errors and keep looping.
end
