Bug #9927

webrick does not unset content-length when responding to HEAD requests

Added by Adrien Thebo 11 months ago. Updated 11 months ago.

[ruby-core:63072]
Status:Rejected
Priority:Normal
Assignee:-
ruby -v:ruby 1.9.3p484 (2013-11-22 revision 43786) [x86_64-linux]
Backport:2.0.0: UNKNOWN, 2.1: UNKNOWN

Description

When WEBrick responds to a HEAD request it omits the body (per RFC 2616, section 4.4, Message Length). However, when setting up the response headers it sets the Content-Length field to the length of the body, so the response advertises a content length that does not match the bytes actually sent. As a result, some HTTP clients may hang while waiting for a body that never arrives.

This is reproducible with the following:

require 'webrick'

server = WEBrick::HTTPServer.new :Port => 8080, :BindAddress => '127.0.0.1'
server.mount_proc("/") do |req, res|
  res.body = "This will be ignored!\r\n"
end
trap('INT') do
  server.shutdown
end
server.start

Running this with curl results in the following:

└> ruby webrick-head.rb                 
[2014-06-10 12:07:28] INFO  WEBrick 1.3.1
[2014-06-10 12:07:28] INFO  ruby 1.9.3 (2013-11-22) [x86_64-linux]
[2014-06-10 12:07:28] INFO  WEBrick::HTTPServer#start: pid=24798 port=8080
localhost - - [10/Jun/2014:12:07:30 PDT] "HEAD / HTTP/1.1" 200 0
- -> /
^C[2014-06-10 12:07:36] INFO  going to shutdown ...
[2014-06-10 12:07:36] INFO  WEBrick::HTTPServer#start done.
└> curl -v -X HEAD http://localhost:8080
* Rebuilt URL to: http://localhost:8080/
* Hostname was NOT found in DNS cache
*   Trying 127.0.0.1...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* Connected to localhost (127.0.0.1) port 8080 (#0)
> HEAD / HTTP/1.1
> User-Agent: curl/7.37.0
> Host: localhost:8080
> Accept: */*
> 
< HTTP/1.1 200 OK 
* Server WEBrick/1.3.1 (Ruby/1.9.3/2013-11-22) is not blacklisted
< Server: WEBrick/1.3.1 (Ruby/1.9.3/2013-11-22)
< Date: Tue, 10 Jun 2014 19:07:30 GMT
< Content-Length: 23
< Connection: Keep-Alive
< 

  0    23    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
  0    23    0     0    0     0      0      0 --:--:--  0:00:02 --:--:--     0
  0    23    0     0    0     0      0      0 --:--:--  0:00:03 --:--:--     0
  0    23    0     0    0     0      0      0 --:--:--  0:00:04 --:--:--     0
  0    23    0     0    0     0      0      0 --:--:--  0:00:05 --:--:--     0
[Ctrl-C sent to server]
  0    23    0     0    0     0      0      0 --:--:--  0:00:06 --:--:--     0{ [data not shown]
* transfer closed with 23 bytes remaining to read

  0    23    0     0    0     0      0      0 --:--:--  0:00:06 --:--:--     0
* Closing connection 0
curl: (18) transfer closed with 23 bytes remaining to read

This is reasonably straightforward to fix: when the headers are being created and the code checks whether the body should be ignored for HTTP 204 and 304, we can also check whether we're responding to a HEAD request and behave accordingly. I've attached patches to this effect.
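The idea can be sketched roughly as follows. This is only an illustration of the check described above, not the attached patch and not WEBrick's actual code; `headers_for` and its parameters are hypothetical names.

```ruby
# Hypothetical helper (not part of WEBrick): builds the length-related
# headers for a response, skipping Content-Length when no body is sent.
def headers_for(request_method, status, body)
  headers = {}
  # 204 and 304 never carry a body; the proposal treats HEAD the same way.
  bodyless = request_method == 'HEAD' || status == 204 || status == 304
  headers['content-length'] = body.bytesize.to_s unless bodyless
  headers
end

head_headers = headers_for('HEAD', 200, "This will be ignored!\r\n")
get_headers  = headers_for('GET', 200, "This will be ignored!\r\n")
```

With this sketch a GET response still advertises the 23-byte body from the reproduction script, while a HEAD response carries no Content-Length at all.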

0001-lib-webrick-httpresponse.rb-unset-content-length-hea.patch (1.54 KB) Adrien Thebo, 06/10/2014 07:18 PM

History

#1 Updated by Adrien Thebo 11 months ago

I didn't make this very clear, but the curl invocation hangs for about 6 seconds until the webrick server is killed, and then prints 'transfer closed with 23 bytes remaining to read' which indicates it was hanging.

#2 Updated by Eric Hodel 11 months ago

  • Status changed from Open to Rejected

This appears to be a bug in curl.

RFC 7230 says:

3.3.  Message Body

   The message body (if any) of an HTTP message is used to carry the
   payload body of that request or response.  The message body is
   identical to the payload body unless a transfer coding has been
   applied, as described in Section 3.3.1.

     message-body = *OCTET

   The rules for when a message body is allowed in a message differ for
   requests and responses.

   The presence of a message body in a request is signaled by a
   Content-Length or Transfer-Encoding header field.  Request message
   framing is independent of method semantics, even if the method does
   not define any use for a message body.

   The presence of a message body in a response depends on both the
   request method to which it is responding and the response status code
   (Section 3.1.2).  Responses to the HEAD request method (Section 4.3.2
   of [RFC7231]) never include a message body because the associated
   response header fields (e.g., Transfer-Encoding, Content-Length,
   etc.), if present, indicate only what their values would have been if
   the request method had been GET (Section 4.3.1 of [RFC7231]). 2xx
   (Successful) responses to a CONNECT request method (Section 4.3.6 of
   [RFC7231]) switch to tunnel mode instead of having a message body.
   All 1xx (Informational), 204 (No Content), and 304 (Not Modified)
   responses do not include a message body.  All other responses do
   include a message body, although the body might be of zero length.

RFC 2616 says the same using different words in 4.4:

   1.Any response message which "MUST NOT" include a message-body (such
     as the 1xx, 204, and 304 responses and any response to a HEAD
     request) is always terminated by the first empty line after the
     header fields, regardless of the entity-header fields present in
     the message.
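From the client's side, the rules quoted above amount to a small decision function. The following is a minimal sketch of that reading of RFC 7230 section 3.3; `response_has_body?` is a hypothetical name and does not correspond to any curl or WEBrick API.

```ruby
# Hypothetical helper: should a client expect a message body, given the
# request method it sent and the status code it received?
def response_has_body?(request_method, status)
  # Responses to HEAD never include a body, whatever Content-Length says.
  return false if request_method == 'HEAD'
  # 1xx, 204 and 304 responses never include a body.
  return false if (100..199).cover?(status) || status == 204 || status == 304
  # 2xx responses to CONNECT switch to tunnel mode instead.
  return false if request_method == 'CONNECT' && (200..299).cover?(status)
  # Everything else has a body, possibly of zero length.
  true
end
```

Note that the header fields do not appear in this decision at all: whether a body follows is determined by the method and the status code, and Content-Length only describes how long a body is once one is known to exist.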

#3 Updated by Matthew Kerwin 11 months ago

Adrien Thebo wrote:

I didn't make this very clear, but the curl invocation hangs for about 6 seconds until the webrick server is killed, and then prints 'transfer closed with 23 bytes remaining to read' which indicates it was hanging.

This is a problem with the way you're using curl. Webrick is doing the right thing according to the HTTP spec[1], which states that "the payload header fields [including content-length] MAY be omitted." There is no obligation on the server to do so.

You have to tell your client (curl) not to wait for a message body. -X HEAD just changes the bytes emitted in the request, but doesn't change the client behaviour; you have to use -I to signal that this is a metadata-only fetch [2].

[1] http://tools.ietf.org/html/rfc7231#section-4.3.2
[2] http://linux.die.net/man/1/curl
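For comparison, here is a small sketch of a client that knows it is making a HEAD request, using Ruby's Net::HTTP. The server is a hypothetical stand-in built on TCPServer (not WEBrick) that replies with headers only and advertises Content-Length: 23, as in the report; the port is chosen by the operating system.

```ruby
require 'socket'
require 'net/http'

# Stand-in server (an assumption for this sketch, not WEBrick): answers
# one request with headers only, advertising a 23-byte body it never sends.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]

thread = Thread.new do
  client = server.accept
  # Consume the request headers up to the blank line.
  loop do
    line = client.gets
    break if line.nil? || line == "\r\n"
  end
  client.write("HTTP/1.1 200 OK\r\nContent-Length: 23\r\nConnection: close\r\n\r\n")
  client.close
end

# Passing nil as the proxy address keeps any proxy environment variables
# from being used. Net::HTTP#head does not wait for a body.
response = Net::HTTP.start('127.0.0.1', port, nil) { |http| http.head('/') }
thread.join
server.close

head_status = response.code
head_length = response['Content-Length']
head_body   = response.body
```

The request returns as soon as the headers have been read: the Content-Length header is still available as metadata, and the response body is nil.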

#4 Updated by Adrien Thebo 11 months ago

My error, sorry for the distraction.

#5 Updated by Nobuyoshi Nakada 10 months ago

  • Duplicated by Bug #9986: WEBrick content-length being set when transfer-encoding is chunked added

#6 Updated by Yui NARUSE 10 months ago

  • Duplicated by deleted (Bug #9986: WEBrick content-length being set when transfer-encoding is chunked)
