Bug #19017
openNet::HTTP may block when attempting to reuse a persistent connection
Description
Ruby's Net::HTTP code performs a blocking Net::BufferedIO#eof?
check when attempting to reuse a persistent HTTP connection. See https://github.com/ruby/ruby/blob/6b099328af2ae2d04cbfd06fedc36a19cdecd30d/lib/net/http.rb#L1573. The bug is the check can hang for up to the HTTP read_timeout
, which is 60 seconds by default.
The code calls TCPSocket#wait_readable(0)
to see if the socket is readable before calling the blocking eof?
method. However, it's possible for the socket to be readable with SSL Handshake records and no Application Data. So the call to eof?
will process the SSL Handshake records, but hang since no Application Data is available.
The issue can be triggered if a TLS 1.3 server sends a NewSessionTicket
sometime after Application Data is written. The attached client and server code demonstrate the problem. Note it's important that the client and server be on separate hosts otherwise eof?
will always return immediately.
On the server, copy Server.java
and certs.p12
into a directory, install JDK 17, compile the server and run it:
$ openssl pkcs12 -info -in certs.p12 -noout -passin pass:password
MAC: sha1, Iteration 2048
MAC length: 20, salt length: 8
PKCS7 Encrypted data: pbeWithSHA1And40BitRC2-CBC, Iteration 2048
Certificate bag
PKCS7 Data
Shrouded Keybag: pbeWithSHA1And3-KeyTripleDES-CBC, Iteration 2048
$ sudo apt install -y openjdk-17-jdk openjdk-17-jre
$ javac Server.java
$ java -Djavax.net.debug=ssl,verbose Server
Loaded pkcs12
On the client, copy http.rb
and ca.pem
into a directory, add the IP address for the server as pluto
to /etc/hosts
:
$ file ca.pem
$ sudo vi /etc/hosts
...
192.168.0.10 pluto
...
$ ruby --version
ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [x86_64-linux]
$ openssl version
OpenSSL 1.1.1f 31 Mar 2020
Run the client to make the first request:
$ ruby http.rb
opening connection to pluto:8888...
opened
starting SSL for pluto:8888...
The server will handle request_1 and trigger a new session ticket:
javax.net.ssl|DEBUG|10|main|2022-09-22 18:18:23.269 UTC|SSLCipher.java:466|jdk.tls.keyLimits: entry = AES/GCM/NoPadding KeyUpdate 2^37. AES/GCM/NOPADDING:KEYUPDATE = 137438953472
Connected to 37532
Handling request_0
... snip ...
javax.net.ssl|DEBUG|10|main|2022-09-22 18:18:25.310 UTC|SSLCipher.java:2024|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|10|main|2022-09-22 18:18:25.335 UTC|SSLCipher.java:1870|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
read body
updated session data
javax.net.ssl|ALL|10|main|2022-09-22 18:18:25.343 UTC|SSLSocketImpl.java:1564|trigger new session ticket
wrote response
Handling request_1
The client will hang when trying to reuse the persistent connection:
OSSL_DEBUG: SSL SESSION new callback added [ossl_ssl.c:963]
SSL established, protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384
<- "POST / HTTP/1.1\r\nAccept-Encoding: identity\r\nConnection: keep-alive\r\nContent-Type: text/plain\r\nAccept: */*\r\nUser-Agent: Ruby\r\nHost: pluto:8888\r\nContent-Length: 0\r\n\r\n"
<- ""
OSSL_DEBUG: SSL SESSION new callback entered [ossl_ssl.c:454]
-> "HTTP/1.1 200 OK\r\n"
-> "Content-Length: 0\r\n"
-> "\r\n"
reading 0 bytes...
-> ""
read 0 bytes
Conn keep-alive
OSSL_DEBUG: SSL SESSION new callback entered [ossl_ssl.c:454]
Pressing Ctrl-C shows the backtrace:
^CTraceback (most recent call last):
11: from http.rb:10:in `<main>'
10: from /home/josh/.rbenv/versions/2.7.6/lib/ruby/2.7.0/net/http.rb:933:in `start'
9: from http.rb:17:in `block in <main>'
8: from /home/josh/.rbenv/versions/2.7.6/lib/ruby/2.7.0/net/http.rb:1294:in `post'
7: from /home/josh/.rbenv/versions/2.7.6/lib/ruby/2.7.0/net/http.rb:1506:in `send_entity'
6: from /home/josh/.rbenv/versions/2.7.6/lib/ruby/2.7.0/net/http.rb:1492:in `request'
5: from /home/josh/.rbenv/versions/2.7.6/lib/ruby/2.7.0/net/http.rb:1518:in `transport_request'
4: from /home/josh/.rbenv/versions/2.7.6/lib/ruby/2.7.0/net/http.rb:1573:in `begin_transport'
3: from /home/josh/.rbenv/versions/2.7.6/lib/ruby/2.7.0/net/protocol.rb:134:in `eof?'
2: from /home/josh/.rbenv/versions/2.7.6/lib/ruby/2.7.0/openssl/buffering.rb:300:in `eof?'
1: from /home/josh/.rbenv/versions/2.7.6/lib/ruby/2.7.0/openssl/buffering.rb:57:in `fill_rbuff'
/home/josh/.rbenv/versions/2.7.6/lib/ruby/2.7.0/openssl/buffering.rb:57:in `sysread': Interrupt
I get the same behavior with latest ruby too:
$ ruby --version
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux]
Changing Net::HTTP to the following:
elsif @socket.io.read_nonblock(0, exception: false).nil?
Resolves the issue:
$ ruby http.rb
opening connection to pluto:8888...
opened
starting SSL for pluto:8888...
OSSL_DEBUG: SSL SESSION new callback added [ossl_ssl.c:963]
SSL established, protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384
<- "POST / HTTP/1.1\r\nAccept-Encoding: identity\r\nConnection: keep-alive\r\nContent-Type: text/plain\r\nAccept: */*\r\nUser-Agent: Ruby\r\nHost: pluto:8888\r\nContent-Length: 0\r\n\r\n"
<- ""
OSSL_DEBUG: SSL SESSION new callback entered [ossl_ssl.c:454]
-> "HTTP/1.1 200 OK\r\n"
-> "Content-Length: 0\r\n"
-> "\r\n"
reading 0 bytes...
-> ""
read 0 bytes
Conn keep-alive
<- "POST / HTTP/1.1\r\nAccept-Encoding: identity\r\nConnection: keep-alive\r\nContent-Type: text/plain\r\nAccept: */*\r\nUser-Agent: Ruby\r\nHost: pluto:8888\r\nContent-Length: 0\r\n\r\n"
<- ""
OSSL_DEBUG: SSL SESSION new callback entered [ossl_ssl.c:454]
-> "HTTP/1.1 200 OK\r\n"
-> "Content-Length: 0\r\n"
-> "\r\n"
reading 0 bytes...
-> ""
read 0 bytes
Conn keep-alive
However, based on https://github.com/ruby/ruby/pull/1089#issuecomment-159878003 that change may not be correct. Or it could be that Ruby on Windows doesn't have this issue anymore.
Files