Bug #19144

closed

Ruby should set AI_V4MAPPED | AI_ADDRCONFIG getaddrinfo flags by default

Added by kjtsanaktsidis (KJ Tsanaktsidis) over 1 year ago. Updated 3 months ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:110870]

Description

Currently, DNS lookups made with getaddrinfo from Ruby (i.e. not from the Resolv module) cause both A and AAAA DNS requests to be made, even on systems that don’t actually have an IPv6 address that could possibly make the AAAA response useful. I wouldn’t really care about this, normally, but glibc has a bug (https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1961697) which can cause a 5-second delay in DNS lookups when both A and AAAA records are queried in parallel. This bug is fixed in glibc upstream but still present in some LTS Linux distros (Ubuntu 18.04 and 20.04 at least), so I think it’s worthwhile to try to work around it in circumstances where the AAAA request is pointless anyway.

The dual A/AAAA lookup happens because whenever Ruby calls getaddrinfo to perform DNS lookups, it always sets hints, and sets hints->ai_flags to zero by default unless flags are specified by the caller (e.g. AI_PASSIVE is set when binding a TCP server socket in TCPServer.new).

This matches the default value of ai_flags specified by POSIX, which is zero. However, glibc behaves differently. When glibc’s getaddrinfo function is called with NULL for the hints parameter, it defaults the ai_flags value to (AI_V4MAPPED | AI_ADDRCONFIG). The manpage (from the Linux man-pages project - https://man7.org/linux/man-pages/man3/getaddrinfo.3.html) claims “this is an improvement on the standard” (although I couldn’t find this mentioned in the glibc manual itself).

Of course, we’re not actually ever calling getaddrinfo with NULL hints; so, we never actually use these flags on glibc systems (unless they’re explicitly specified by the caller).

My proposal is that we should change Ruby to set these two flags by default, when they’re available, in the following circumstances:

  • In all calls made internally to rsock_getaddrinfo as a result of socket functions like TCPSocket.new, UDPSocket.new, etc.
  • EXCEPT when AI_PASSIVE is also set (i.e. when we’re trying to get an address to bind for a listener socket - see below)
  • In calls made to rsock_getaddrinfo as a direct result of calling Addrinfo.getaddrinfo from Ruby with nil flags
  • EXCEPT calls to Addrinfo.getaddrinfo where explicit flags are provided
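
The rules above can be sketched as a small decision function. This is only an illustration of the proposal: proposed_ai_flags and its keyword arguments are hypothetical names, not part of Ruby's socket library, and the real change would live in the C code around rsock_getaddrinfo.

```ruby
require 'socket'

# Hypothetical helper (not a Ruby API): returns the ai_flags value the
# proposal would pass to getaddrinfo for a given kind of call.
def proposed_ai_flags(explicit_flags: nil, passive: false)
  # Flags given explicitly by the caller are always left untouched.
  return explicit_flags unless explicit_flags.nil?

  # Listener sockets keep today's behaviour: AI_PASSIVE only.
  return Socket::AI_PASSIVE if passive

  # Otherwise default to whichever of the two flags this platform defines.
  %i[AI_V4MAPPED AI_ADDRCONFIG].reduce(0) do |flags, name|
    Socket.const_defined?(name) ? flags | Socket.const_get(name) : flags
  end
end

puts proposed_ai_flags                    # outgoing connection, no flags given
puts proposed_ai_flags(passive: true)     # binding a listener socket
puts proposed_ai_flags(explicit_flags: 0) # caller explicitly asked for no flags
```

The const_defined? check stands in for the "when they're available" condition, since not every platform defines both constants.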

Both of these seem like something you would almost always want to be doing in any outgoing connection scenario:

  • AI_V4MAPPED ensures that, if AF_INET6 is explicitly specified as the desired protocol, and there is no AAAA record in DNS, that any A record that is present gets converted to an IPv4-mapped IPv6 address so it can be used e.g. with NAT64.
  • AI_ADDRCONFIG ensures that, if a machine has no IPv6 address, it doesn’t bother making an AAAA lookup that will return IPv6 addresses that can’t actually be used for anything (and vice versa for IPv4).

The reason why we wouldn’t want to set AI_ADDRCONFIG in circumstances where Ruby currently sets AI_PASSIVE is that loopback addresses are not considered in deciding if a system has an IPv4/IPv6 address. Conceivably, you might want to bind to a ::1 loopback address, and allow other processes on the same machine to connect to that.

Does changing this default sound reasonable? If so I can prepare a patch. Another option I considered is doing this only when Ruby is built against glibc (so that other system behaviour is most closely matched).

Updated by kjtsanaktsidis (KJ Tsanaktsidis) over 1 year ago

A gentle poke to see if anybody has some thoughts on this?

Updated by akr (Akira Tanaka) about 1 year ago

  • Status changed from Open to Feedback

I feel AI_ADDRCONFIG is good if the result addresses are used immediately for making a connection.

But getaddrinfo can be used just for getting DNS information.
AI_ADDRCONFIG is not suitable for this situation.

I don't understand why AI_V4MAPPED is useful.
Also, some systems, such as NetBSD, do not seem to have AI_V4MAPPED.
https://man.netbsd.org/NetBSD-9.3/getaddrinfo.3
Using AI_V4MAPPED introduces incompatibility.

Ruby has several methods to invoke getaddrinfo() and connect() internally, such as TCPSocket.new.
How about we specify AI_ADDRCONFIG for getaddrinfo invocations in such methods?
This avoids the problem (useless AAAA query) and
doesn't affect applications that invoke getaddrinfo directly (which might have a problem with AI_ADDRCONFIG).

Updated by kjtsanaktsidis (KJ Tsanaktsidis) about 1 year ago

Thank you for having a look at this!

Ruby has several methods to invoke getaddrinfo() and connect() internally, such as TCPSocket.new.
How about we specify AI_ADDRCONFIG for getaddrinfo invocations in such methods?

I'm OK with doing just this, and not changing direct calls to Addrinfo.getaddrinfo. You're right, it's going to solve 99% of the problems and avoid any potential compatibility issues.

I don't understand why AI_V4MAPPED is useful.

I did a bit more research into this. Actually what I said in the original issue about NAT64 is wrong, v4 mapped v6 addresses have nothing to do with NAT64.

What this flag does actually is:

  • When making a call to getaddrinfo with both AF_INET6 and AI_V4MAPPED,
  • If there is no AAAA record for a name,
  • And there is an A record for a name,
  • Return an "IPv4-mapped IPv6 address", which is an IPv6 address prefixed with ::FFFF and then the four bytes of the IPv4 address at the end e.g. ::FFFF:1.2.3.4

The point of the IPv4-mapped IPv6 address actually has nothing to do with NAT64. Rather, when calling connect(2) on such an IPv6 address, if the host actually does have an IPv4 address as well, it will make the connection with the IPv4 stack. The purpose of this, it seems, is to allow applications to be written to only handle IPv6, and they'll transparently get IPv4 support for free.
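
To make the address format concrete, here is a minimal sketch using Ruby's standard ipaddr library. It only shows what an IPv4-mapped IPv6 address looks like and how it converts back; it does not call getaddrinfo, and the address 1.2.3.4 is just the example value from above.

```ruby
require 'ipaddr'

ipv4 = IPAddr.new('1.2.3.4')

# Build the IPv4-mapped IPv6 form (::ffff: followed by the IPv4 bytes).
mapped = ipv4.ipv4_mapped

puts mapped.to_s          # textual form of the mapped address
puts mapped.ipv6?         # it is an IPv6 address as far as the type goes
puts mapped.ipv4_mapped?  # and it is recognised as IPv4-mapped

# Converting back recovers the original IPv4 address.
puts mapped.native.to_s
```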

I don't think Ruby actually needs this flag - it defaults to making the request with AF_UNSPEC and can handle getting either IPv4 or IPv6 addresses out of getaddrinfo correctly. In fact, the only way for any of the socket connect methods to pass a specific address family in here is UDPSocket.new(Socket::AF_INET6).connect('hostname', port_number). If this actually made an IPv4 connection because getaddrinfo returned an IPv4-mapped IPv6 address, I think that would be very confusing.

So, I think you're right - we should not set AI_V4MAPPED by default.

Also, some systems, such as NetBSD, do not seem to have AI_V4MAPPED.

I think I would add feature checks for these flags in socket's extconf.rb.


Thanks again for your feedback. I'll try and send a PR later this week which defaults AI_ADDRCONFIG to on when getaddrinfo is called from inside the socket connection methods (but NOT when called explicitly with Socket.getaddrinfo et al).
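
Since explicit Addrinfo.getaddrinfo calls stay unchanged under this plan, a caller who wants the same behaviour for their own lookups would have to pass the flag themselves. A rough sketch of that, assuming the names ADDRCONFIG_FLAG and resolve_for_connect (both made up for this example, and not how the PR itself is implemented):

```ruby
require 'socket'

# Use AI_ADDRCONFIG where the platform defines it, otherwise no flags.
ADDRCONFIG_FLAG = Socket.const_defined?(:AI_ADDRCONFIG) ? Socket::AI_ADDRCONFIG : 0

# Hypothetical helper: resolve a host for an outgoing TCP connection,
# asking the resolver to skip address families this machine cannot use.
def resolve_for_connect(host, port)
  Addrinfo.getaddrinfo(host, port, nil, :STREAM, nil, ADDRCONFIG_FLAG)
end

# Example use (needs working DNS, so it is left commented out):
#   resolve_for_connect('example.com', 80).each { |ai| puts ai.inspect }
```

Code that only wants DNS information, rather than an address to connect to, would keep calling Addrinfo.getaddrinfo without the flag, which is the case akr raised above.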

Updated by kjtsanaktsidis (KJ Tsanaktsidis) about 1 year ago

OK, I opened https://github.com/ruby/ruby/pull/7295 with those changes. Thanks again!

Updated by kjtsanaktsidis (KJ Tsanaktsidis) about 1 year ago

@akr (Akira Tanaka) could you take a look at my PR when you get a chance? I think I addressed your feedback, please let me know if I have misunderstood!

Updated by jeremyevans0 (Jeremy Evans) 4 months ago

  • Status changed from Feedback to Open

Updated by akr (Akira Tanaka) 4 months ago

kjtsanaktsidis (KJ Tsanaktsidis) wrote in #note-5:

@akr (Akira Tanaka) could you take a look at my PR when you get a chance? I think I addressed your feedback, please let me know if I have misunderstood!

It seems fine.

I agree that we should remove the test, test_ai_addrconfig, because it is too complicated.

Updated by Anonymous 3 months ago

  • Status changed from Open to Closed

Applied in changeset git|d2ba8ea54a4089959afdeecdd963e3c4ff391748.


Set AI_ADDRCONFIG when making getaddrinfo(3) calls for outgoing conns (#7295)

When making an outgoing TCP or UDP connection, set AI_ADDRCONFIG in the
hints we send to getaddrinfo(3) (if supported). This will prompt the
resolver to NOT issue A or AAAA queries if the system does not
actually have an IPv4 or IPv6 address (respectively).

This makes outgoing connections marginally more efficient on
non-dual-stack systems, since we don't have to try connecting to an
address which can't possibly work.

More importantly, however, this works around a race condition present
in some older versions of glibc on aarch64 where it could accidentally
send the two outgoing DNS queries with the same DNS txnid, and get
confused when receiving the responses. This manifests as outgoing
connections sometimes taking 5 seconds (the DNS timeout before retry) to
be made.

Fixes #19144
