Project

General

Profile

Actions

Bug #17561

closed

The timeout option for Addrinfo.getaddrinfo is not reliable on Ruby 2.7.2

Added by smcgivern (Sean McGivern) about 3 years ago. Updated over 2 years ago.


Description

#15553 introduced a timeout option for Addrinfo.getaddrinfo, which uses getaddrinfo_a internally. It appears this has since been reverted in the development branch via https://github.com/ruby/ruby/commit/5d8bcc4870601ab1ee05346346f241d4a805aac9 (due to #17220 maybe; I didn't quite follow the discussion).

However, Ruby 2.7.2 is still not reliable, even without forking:

$ ruby -v
ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-linux]
$ ruby -e "require 'resolv'; 10000.times { |i| p [i, Addrinfo.getaddrinfo('2130706433', 80, nil, :STREAM, timeout: 5)] }" | tail
Traceback (most recent call last):
	3: from -e:1:in `<main>'
	2: from -e:1:in `times'
	1: from -e:1:in `block in <main>'
-e:1:in `getaddrinfo': getaddrinfo_a: All requests done (SocketError)
[1473, [#<Addrinfo: 127.0.0.1:80 TCP (2130706433)>]]
[1474, [#<Addrinfo: 127.0.0.1:80 TCP (2130706433)>]]
[1475, [#<Addrinfo: 127.0.0.1:80 TCP (2130706433)>]]
[1476, [#<Addrinfo: 127.0.0.1:80 TCP (2130706433)>]]
[1477, [#<Addrinfo: 127.0.0.1:80 TCP (2130706433)>]]
[1478, [#<Addrinfo: 127.0.0.1:80 TCP (2130706433)>]]
[1479, [#<Addrinfo: 127.0.0.1:80 TCP (2130706433)>]]
[1480, [#<Addrinfo: 127.0.0.1:80 TCP (2130706433)>]]
[1481, [#<Addrinfo: 127.0.0.1:80 TCP (2130706433)>]]
[1482, [#<Addrinfo: 127.0.0.1:80 TCP (2130706433)>]]

This is on a VirtualBox VM and fails fairly quickly. On a 'real' Linux system, I need to try a few times or bump the number of iterations, but it also fails consistently with consecutive requests. I'm choosing 2130706433 (the decimal representation of 127.0.0.1) as that's what our test suite uses, and that's what failed when I tried to use the timeout option.

On Ruby 3.0.0-dev this does not fail due to the aforementioned revert, but should this also be removed from Ruby 2.7 until it's ready?

Updated by stanhu (Stan Hu) about 3 years ago

I agree that this should be removed from Ruby 2.7. In https://gitlab.com/gitlab-org/gitlab/-/issues/11951#note_510756578, we were able to trigger a seg fault by doing this:

require 'socket'

loop do
  Addrinfo.getaddrinfo('localhost', nil, timeout: 0)
rescue SocketError
end

Updated by ko1 (Koichi Sasada) almost 3 years ago

confirmed on my machine:

$ ruby -ve "require 'resolv'; 10000.times { |i| p [i, Addrinfo.getaddrinfo('2130706433', 80, nil, :STREAM, timeout: 0)] }"
ruby 2.7.3p183 (2021-04-05 revision 6847ee089d) [x86_64-linux]
*** stack smashing detected ***: terminated
Aborted

Updated by mame (Yusuke Endoh) almost 3 years ago

  • Status changed from Open to Assigned
  • Assignee set to Glass_saga (Masaki Matsushita)

@Glass_saga (Masaki Matsushita) Could you handle this?

Updated by Glass_saga (Masaki Matsushita) almost 3 years ago

  • Backport changed from 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN to 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED, 3.0: DONTNEED

Although I can't reproduce the issue on my machine (I tried several times), I agree that this should be removed from 2.7.
I will create a patch.

Updated by usa (Usaku NAKAMURA) almost 3 years ago

About 2.7, to make raise ArgumentError (or other exception) is not acceptable because it's *stable branch.
I strongly hope to fix this problem at master branch and the fix can be backportable, but at this time, we should simply ignore timeout: argument and note it on documentation.

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

usa (Usaku NAKAMURA) wrote in #note-5:

About 2.7, to make raise ArgumentError (or other exception) is not acceptable because it's *stable branch.
I strongly hope to fix this problem at master branch and the fix can be backportable, but at this time, we should simply ignore timeout: argument and note it on documentation.

I sent a pull request to disable the timeout option: https://github.com/ruby/ruby/pull/4618

I couldn't see where the timeout option was documented, so I didn't modify the documentation.

I don't think we'll be able to backport a fix to the issue. The use of getaddrinfo_a was completely removed in 5d8bcc4870601ab1ee05346346f241d4a805aac9.

Updated by usa (Usaku NAKAMURA) over 2 years ago

  • Backport changed from 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED, 3.0: DONTNEED to 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE, 3.0: DONTNEED

merged into ruby_2_7 at d9ff8b3e86a03499a5c6bc36fae1592914a25b9c

Actions #8

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

  • Status changed from Assigned to Closed
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0