Project

General

Profile

Actions

Bug #18791

closed

Unexpected behavior of Socket#connect_nonblock

Added by midnight (Sarun R) about 1 month ago. Updated about 1 month ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 3.1.1p18 (2022-02-18 revision 53f5fc4236) [x86_64-linux]
[ruby-core:108626]

Description

I followed an example of Socket#connect_nonblock on the document.
Waiting for multiple Socket connections at once.

# Pull down Google's web page
require 'socket'
include Socket::Constants
socket = Socket.new(AF_INET, SOCK_STREAM, 0)
sockaddr = Socket.sockaddr_in(80, 'www.google.com')
begin # emulate blocking connect
  socket.connect_nonblock(sockaddr)
rescue IO::WaitWritable
  IO.select(nil, [socket]) # wait 3-way handshake completion
  begin
    # ****************** The problem is here ********************
    socket.connect_nonblock(sockaddr) # check connection failure
  rescue Errno::EISCONN
  end
end
socket.write("GET / HTTP/1.0\r\n\r\n")
results = socket.read

The first call to connect_nonblock raises IO::WaitWritable as expected.
But the confirmation call did not raise Errno::EISCONN; instead, it returned 0.

Upon source code inspection, 0 was returned from connect and is supposed to mean success.

static VALUE
sock_connect_nonblock(VALUE sock, VALUE addr, VALUE ex)
{
    VALUE rai;
    rb_io_t *fptr;
    int n;

    SockAddrStringValueWithAddrinfo(addr, rai);
    addr = rb_str_new4(addr);
    GetOpenFile(sock, fptr);
    rb_io_set_nonblock(fptr);
    n = connect(fptr->fd, (struct sockaddr*)RSTRING_PTR(addr), RSTRING_SOCKLEN(addr));
    if (n < 0) {
	int e = errno;
	if (e == EINPROGRESS) {
            if (ex == Qfalse) {
                return sym_wait_writable;
            }
            rb_readwrite_syserr_fail(RB_IO_WAIT_WRITABLE, e, "connect(2) would block");
	}
	if (e == EISCONN) {
            if (ex == Qfalse) {
                return INT2FIX(0);
            }
	}
	rsock_syserr_fail_raddrinfo_or_sockaddr(e, "connect(2)", addr, rai);
    }

    return INT2FIX(n);
}

I made sure ex is true, so it is not return INT2FIX(0); that get returned but the last statement return INT2FIX(n);.
Is this the intended behavior? It does surprise me and clearly doesn't explain very well in the document.
The example shown implied that the only way to confirm the connection is Errno::EISCONN; never mention the return code.

Updated by midnight (Sarun R) about 1 month ago

Should I treat 0 as a successful connection?
Does anyone have an insight on this?
Does anyone want me to try something on my side?

Updated by jeremyevans0 (Jeremy Evans) about 1 month ago

  • Status changed from Open to Rejected

This doesn't look like a bug. If the connect system call returns 0, then you were able to connect, and connect_nonblock in Ruby returns 0 as well. Yes, you should treat 0 as indicating the connection was successful. The documentation states Returns 0 if successful, otherwise an exception is raised. I think that explains the behavior correctly, so this does not appear to be a documentation bug either.

I think maybe you are expecting Errno::EISCONN must be raised to indicate a successful connection. That would also indicate a successful connection, but having the method return 0 also indicates the connection was successful. The example code is written so either returning 0 or raising Errno::EISCONN is treated as a successful connection, so the example code does not appear to be incorrect.

Updated by midnight (Sarun R) about 1 month ago

@jeremyevans0 (Jeremy Evans)
The example is overly simplified, and it does lead to confusion in practice when you put it in context.

Realistically speaking, no one would use a single value in IO.select; at least, in the worst-case scenario.
So the return value of IO.select doesn't get discarded like in the example; the connect_nonblock would generically run in a loop, not a one-off attempt.

This is the relevant part of an IRL Ruby method implementing a POC version of RFC8305:

selectable = []
elapse = 0.0
socket_map = {}
addresses.each do |address|
  socket_domain, resolved_address = address.values_at(0, 3)
  socket = ::Socket.new(socket_domain, :STREAM, 0)
  sock_address = ::Socket.pack_sockaddr_in(port, resolved_address)
  socket_map[socket] = sock_address
  begin
    selectable << socket
    socket.connect_nonblock(sock_address)
  rescue ::IO::WaitWritable
    ready_connections = ::IO.select(nil, selectable, nil, @connection_attempt_delay)
    Array(ready_connections).flatten(1).compact.each do |test_connection|
      test_address = socket_map[test_connection]
      raise AddressResolutionError unless test_address
      test_connection.connect_nonblock(test_address)
    rescue Errno::EISCONN
      return connected_handler.call(test_connection)
    end
    elapse += @connection_attempt_delay
    raise ConnectionTimeoutError if @connection_timeout < elapse
  end
end

It is not runnable by itself but you'll get the idea.
The code surprised me by always making at least two connection attempts without a connection issue on any of the addresses.

To be clear the example did nothing wrong in isolating but does exploit something unrealistic in practice.
The example discards the return value of IO.select and lets the code fall through to reach socket.write, disregarding the returned value.
When we don't let the code fall through, it does appear to be incorrect to me.

Updated by midnight (Sarun R) about 1 month ago

The document does mention the returned value of 0, but I'm not sure if I should also check for EISCONN by calling it one more time in such a case.

Actions

Also available in: Atom PDF