Bug #16672

net/http leaves original content-length header intact after inflating response

Added by jmreid (Justin Reid) 9 months ago. Updated 9 months ago.

Status: Open
Priority: Normal
Assignee: -
Target version: -
ruby -v: ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-darwin19]
[ruby-core:97359]

Description

When using net/http to make a request to a resource, the default request headers are the following (when you have ZLIB available):
"accept-encoding"=>["gzip;q=1.0,deflate;q=0.6,identity;q=0.3"], "accept"=>["*/*"], "user-agent"=>["Ruby"]
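
These defaults can be inspected without sending a request. The following is a minimal sketch, assuming a Ruby build with zlib available; the host example.com is a placeholder and is not part of the original report:

```ruby
require 'net/http'
require 'uri'

# Placeholder URI, used only to build a request object; nothing is sent.
uri = URI('http://example.com/')
req = Net::HTTP::Get.new(uri)

# to_hash returns the request headers as a Hash of name => [values].
default_headers = req.to_hash
puts default_headers.inspect

# decode_content is true when net/http added accept-encoding itself,
# which is also when it will inflate the response body on its own.
puts "decode_content: #{req.decode_content.inspect}"
```

Passing an explicit accept-encoding header when building the request turns this automatic decoding off, so the header and body then stay as the server sent them.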

This means that a server will return a gzipped response if it can provide one. Take this URL for example:
https://storage.googleapis.com/justin-reid-test/test.js

This is a JS file that has a content-length of 2733 bytes when gzipped and 9995 bytes when inflated:

curl "https://storage.googleapis.com/justin-reid-test/test.js" -H "accept-encoding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3" | wc -c
2733

curl "https://storage.googleapis.com/justin-reid-test/test.js" | wc -c
9995

When making a simple request for this asset using net/http:

require 'net/http'

uri = URI('https://storage.googleapis.com/justin-reid-test/test.js')
res = Net::HTTP.get_response(uri)

Ruby will do the following (see https://github.com/ruby/ruby/blob/f08cd708b11dd5b293986b92bb5e227731665b36/lib/net/http/response.rb#L264-L278):

  • Delete the content-encoding header
  • Inflate the body
  • Return the inflated body
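
The three steps above can be modelled offline. The following is a simplified sketch using a plain Hash and Zlib directly; it is not net/http's own code, and the sample body and header hash are made up for illustration:

```ruby
require 'zlib'

# Made-up body standing in for the JS file; gzip it as a server would.
plain_body   = 'console.log("hello");' * 100
gzipped_body = Zlib.gzip(plain_body)

# Headers as they would arrive with the compressed response.
headers = {
  'content-encoding' => 'gzip',
  'content-length'   => gzipped_body.bytesize.to_s
}

# Step 1: delete the content-encoding header.
headers.delete('content-encoding')
# Steps 2 and 3: inflate the body and hand back the inflated result.
inflated_body = Zlib.gunzip(gzipped_body)

# content-length was never touched, so it still describes the gzipped bytes.
puts "Body size using String#bytesize: #{inflated_body.bytesize}"
puts "Content-Length header: #{headers['content-length']}"
```

In this model nothing updates content-length after inflation, so the header and the body size disagree.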

The issue here is that Ruby also leaves the content-length header set to the size of the original compressed response:

require 'net/http'

uri = URI('https://storage.googleapis.com/justin-reid-test/test.js')
res = Net::HTTP.get_response(uri)

puts "Fetching: https://storage.googleapis.com/justin-reid-test/test.js"
puts "Body size using String#bytesize: #{res.body.to_s.bytesize}"
puts "Content-Length response header: #{res.content_length}"

Results in:

Fetching: https://storage.googleapis.com/justin-reid-test/test.js
Body size using String#bytesize: 9995
Content-Length response header: 2733

This means that whenever net/http requests a gzipped resource and inflates it, the response carries a content-length header that no longer matches the body.

This issue was noticed when Rack changed how it computes content-length. Rack used to compute the content-length for each body, but that changed in 2.0.8:
https://github.com/rack/rack/commit/8c62821f4a464858a6b6ca3c3966ec308d2bb53e#diff-10b933d2c1fdc82ceecade456c64e1c2L92
https://github.com/rack/rack/issues/1472#issuecomment-574362342

Rack now prefers that you use Rack::ContentLength if you need the content-length computed. However, Rack::ContentLength will not recompute the value if that header already exists:
https://github.com/rack/rack/blob/6196377654b7ff7ce7abaecea62bb285d77d53aa/lib/rack/content_length.rb#L21
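
The effect of that check on a stale header can be illustrated with a simplified model. The helper add_content_length_if_missing below is hypothetical and does not reproduce Rack's actual code; it only mirrors the "skip when the header is already present" behaviour described above, using the sizes from this report:

```ruby
# Hypothetical helper: sets content-length only when it is absent.
# This is a simplified model, not Rack::ContentLength's implementation.
def add_content_length_if_missing(headers, body_parts)
  return headers if headers.key?('content-length')

  total = body_parts.sum(&:bytesize)
  headers.merge('content-length' => total.to_s)
end

# Stand-in for the 9995-byte inflated body from the report.
inflated_body = 'x' * 9995

# Header left over from the gzipped response: kept as-is, so it stays wrong.
stale = add_content_length_if_missing({ 'content-length' => '2733' }, [inflated_body])
# No header present: the length is computed from the body.
absent = add_content_length_if_missing({}, [inflated_body])

puts stale['content-length']   # prints 2733
puts absent['content-length']  # prints 9995
```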

Should Ruby:
