https://bugs.ruby-lang.org/https://bugs.ruby-lang.org/favicon.ico?17113305112020-03-04T22:05:05ZRuby Issue Tracking SystemRuby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=844862020-03-04T22:05:05Zjmreid (Justin Reid)
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/84486/diff?detail_id=56523">diff</a>)</li></ul> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=844872020-03-04T22:25:04Zioquatix (Samuel Williams)samuel@oriontransfer.net
<ul></ul><p>Are you using <code>res.content_length</code>? Why can't you just use <code>res.body.to_s.bytesize</code>?</p> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=844902020-03-05T01:40:14Zjmreid (Justin Reid)
<ul></ul><p>ioquatix (Samuel Williams) wrote in <a href="#note-2">#note-2</a>:</p>
<blockquote>
<p>Are you using <code>res.content_length</code>? Why can't you just use <code>res.body.to_s.bytesize</code>?</p>
</blockquote>
<p><code>res.content_length</code> is just the <code>content-length</code> response header from the net/http request:</p>
<pre><code>uri = URI('https://storage.googleapis.com/justin-reid-test/test.js')
res = Net::HTTP.get_response(uri)
res.to_hash
</code></pre>
<pre><code>{"x-guploader-uploadid"=>["AEnB2UrJSkEHDPO5iCRug2Qp0bzweRd8pd05d5eRq0fIUtRVZuibaGfQZhLHBN58g0ZW1N-qMcsUiZACutpDoObHTijYs3_DrUNOy8SH9HA1hhTW0RtIbco"],
"date"=>["Thu, 05 Mar 2020 01:35:08 GMT"],
"expires"=>["Fri, 05 Mar 2021 01:35:08 GMT"],
"last-modified"=>["Wed, 04 Mar 2020 20:53:40 GMT"],
"etag"=>["\"e5994ce974ae1b8e426810037812e7d5\""],
"x-goog-generation"=>["1583355220705723"],
"x-goog-metageneration"=>["2"],
"x-goog-stored-content-encoding"=>["gzip"],
"x-goog-stored-content-length"=>["2733"],
"content-type"=>["application/javascript; charset=utf-8"],
"x-goog-hash"=>["crc32c=VCx7Dg==", "md5=5ZlM6XSuG45CaBADeBLn1Q=="],
"x-goog-storage-class"=>["STANDARD"],
"accept-ranges"=>["bytes"],
"vary"=>["Accept-Encoding"],
"content-length"=>["2733"],
"server"=>["UploadServer"],
"cache-control"=>["public, max-age=31536000, immutable"],
"age"=>["6"],
"alt-svc"=>["quic=\":443\"; ma=2592000; v=\"46,43\",h3-Q050=\":443\"; ma=2592000,h3-Q049=\":443\"; ma=2592000,h3-Q048=\":443\"; ma=2592000,h3-Q046=\":443\"; ma=2592000,h3-Q043=\":443\"; ma=2592000"]}
</code></pre>
<p>The issue here is that this <code> "content-length"=>["2733"]</code> value is 2733. The <code>res.body</code> at this point is 9995, so <code>content-length</code> needs to match that or not exist. Otherwise it causes browsers to only partially download the file.</p> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=844912020-03-05T02:19:02Zjeremyevans0 (Jeremy Evans)merch-redmine@jeremyevans.net
<ul></ul><p>jmreid (Justin Reid) wrote in <a href="#note-3">#note-3</a>:</p>
<blockquote>
<p>The issue here is that this <code> "content-length"=>["2733"]</code> value is 2733. The <code>res.body</code> at this point is 9995, so <code>content-length</code> needs to match that or not exist. Otherwise it causes browsers to only partially download the file.</p>
</blockquote>
<p>I don't think I agree with that logic. <code>content_length</code> is documented to return the value of the header, not the size of the decompressed body. So the method appears to be operating exactly as documented.</p>
<p>I'm also not sure why you are mentioning browsers in this context. If I open up Chrome and go into Development Tools, when I request <code>https://storage.googleapis.com/justin-reid-test/test.js</code> and look at the response headers, I see 2733 for <code>Content-Length</code>, not 9995.</p>
<p>The only possible interpretation I can think of where browsers come into play is if you were writing a server, using net/http inside a request handler, and taking the response status, headers, and body returned by net/http and returning them directly as the response. In which case the proper fix would be not decompressing the body automatically:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">res</span> <span class="o">=</span> <span class="no">Net</span><span class="o">::</span><span class="no">HTTP</span><span class="p">.</span><span class="nf">get_response</span><span class="p">(</span><span class="n">uri</span><span class="p">){</span><span class="o">|</span><span class="n">res</span><span class="o">|</span> <span class="n">res</span><span class="p">.</span><span class="nf">decode_content</span> <span class="o">=</span> <span class="kp">false</span><span class="p">}</span>
</code></pre> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=844922020-03-05T02:56:42Zjmreid (Justin Reid)
<ul></ul><blockquote>
<p>So the method appears to be operating exactly as documented.</p>
</blockquote>
<p>I'm not saying that method isn't working as intended. It's working as intended and I'm just using it to show the size of the header for the request net/http made. My comment saying that <code>content-length</code> needs to match 9995 meant: The <code>Content-Length</code> header that net/http returns needs to actually match the content length of the body for that request.</p>
<blockquote>
<p>If I open up Chrome and go into Development Tools, when I request <a href="https://storage.googleapis.com/justin-reid-test/test.js" class="external">https://storage.googleapis.com/justin-reid-test/test.js</a> and look at the response headers, I see 2733 for Content-Length, not 9995.</p>
</blockquote>
<p>This is because Chrome reports the size of the file over the network and not the decompressed size in dev tools. If you find the resource in the network panel and hover over the "size" column, you'll see that "resource size" is the uncompressed size.</p>
<p>It's best to use curl to test these URLs.</p>
<p>My core concern is that Ruby shouldn't be leaving an incorrect header value around after it mutates the response. I can't see a reason why leaving <code>Content-Length: 2733</code> makes sense when the actual body is <code>9995</code>.</p>
<blockquote>
<p>The only possible interpretation I can think of where browsers come into play is if you were writing a server, using net/http inside a request handler, and taking the response status, headers, and body returned by net/http and returning them directly as the response. In which case the proper fix would be not decompressing the body automatically.</p>
</blockquote>
<p>There are valid reasons to let net/http inflate the body. Just disabling gzip inflation because Ruby passes incorrect values in a mutated response doesn't seem like a proper fix.</p> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=844932020-03-05T04:04:06Zjeremyevans0 (Jeremy Evans)merch-redmine@jeremyevans.net
<ul></ul><p>jmreid (Justin Reid) wrote in <a href="#note-5">#note-5</a>:</p>
<blockquote>
<blockquote>
<p>So the method appears to be operating exactly as documented.</p>
</blockquote>
<p>I'm not saying that method isn't working as intended. It's working as intended and I'm just using it to show the size of the header for the request net/http made. My comment saying that <code>content-length</code> needs to match 9995 meant: The <code>Content-Length</code> header that net/http returns needs to actually match the content length of the body for that request.</p>
</blockquote>
<p>I think I understand your reasoning better now. net/http is currently deleting the <code>Content-Encoding</code> header, so deleting or modifying the <code>Content-Length</code> header makes sense in that light. Modifying the <code>Content-Length</code> header is tricky because you do not know the decoded size until after decoding. Anyway, the current implementation, by deleting <code>Content-Encoding</code> and not changing <code>Content-Length</code>, is inconsistent.</p>
<p>I don't think net/http should be modifying the response headers. I think the response headers should remain exactly as sent by the server.</p>
<p>The deletion of the <code>Content-Encoding</code> header has been present since the initial support was merged in <a class="changeset" title="* lib/net/http.rb (Net::HTTP::get): now supports gzip content-encoding. a patch from Hugh Sass..." href="https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/ff7f462bf49a1199b1657de6a73a0dc91deae1fa">ff7f462bf49a1199b1657de6a73a0dc91deae1fa</a> back in 2007. My guess as to why is so that existing callers that decoded bodies based on the value of the <code>Content-Encoding</code> header would not attempt to decode an already decoded body after the support was merged. And that does seem a reasonable way of keeping backwards compatibility while adding transparent decoding of bodies. I'm not sure how much code still exists that uses net/http and manually decodes bodies based on the <code>Content-Encoding</code> header, but maybe we can consider whether to remove the automatic deletion of the <code>Content-Encoding</code> header when decoding, possibly indicating whether decoding happened using a separate method.</p> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=844942020-03-05T04:42:29Zjmreid (Justin Reid)
<ul></ul><p>jeremyevans0 (Jeremy Evans) wrote in <a href="#note-6">#note-6</a>:</p>
<blockquote>
<p>I don't think net/http should be modifying the response headers. I think the response headers should remain exactly as sent by the server.</p>
</blockquote>
<p>I do agree in general. However, in this case net/http is actually mutating the response itself. To me that feels like a time when modifying the headers <em>should</em> happen as well to match those changes.</p>
<p>Deleting <code>Content-Encoding</code> makes sense because the encoding is no longer <code>gzip</code>. Same logic should be applied to <code>Content-Length</code>, in my opinion.</p>
<p>I do understand that net/http shouldn't be modifying headers for just any reason, but again, changing the actual response contents necessitates changing these two headers.</p>
<blockquote>
<p>maybe we can consider whether to remove the automatic deletion of the Content-Encoding header when decoding, possibly indicating whether decoding happened using a separate method.</p>
</blockquote>
<p>This is definitely another option, but feels like it might set people up for more misunderstanding. If net/http inflates the body but leaves <code>Content-Encoding: gzip</code>, that opens the door for even more confusion in my opinion.</p>
<blockquote>
<p>Modifying the Content-Length header is tricky because you do not know the decoded size until after decoding</p>
</blockquote>
<p>This is true, but <code>Zlib::Inflate</code> does have a <code>total_out</code> method that could be used (just a hacky patch I made while testing this locally):<br>
<a href="https://github.com/ruby/ruby/compare/master...jmreid:content-length" class="external">https://github.com/ruby/ruby/compare/master...jmreid:content-length</a></p> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=844952020-03-05T05:54:55Zjeremyevans0 (Jeremy Evans)merch-redmine@jeremyevans.net
<ul></ul><p>jmreid (Justin Reid) wrote in <a href="#note-7">#note-7</a>:</p>
<blockquote>
<blockquote>
<p>Modifying the Content-Length header is tricky because you do not know the decoded size until after decoding</p>
</blockquote>
<p>This is true, but <code>Zlib::Inflate</code> does have a <code>total_out</code> method that could be used (just a hacky patch I made while testing this locally):<br>
<a href="https://github.com/ruby/ruby/compare/master...jmreid:content-length" class="external">https://github.com/ruby/ruby/compare/master...jmreid:content-length</a></p>
</blockquote>
<p><code>total_out</code> doesn't give you the full size of the output until after the input is fully processed. So if there is an exception or other early exit from the block passed to <code>inflater</code> before the body is fully inflated, you can end up with an incorrect result. Also, the <code>Content-Length</code> header inside the block would still be wrong. You could remove it before the block and only set it on success after the block, that's probably the best way to handle it if you want to modify it.</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">s</span> <span class="o">=</span> <span class="no">Zlib</span><span class="o">::</span><span class="no">Deflate</span><span class="p">.</span><span class="nf">deflate</span><span class="p">(</span><span class="s1">'a'</span><span class="o">*</span> <span class="mi">100</span><span class="p">)</span>
<span class="n">s</span><span class="p">.</span><span class="nf">bytesize</span> <span class="c1"># => 12</span>
<span class="n">io</span> <span class="o">=</span> <span class="no">Zlib</span><span class="o">::</span><span class="no">Inflate</span><span class="p">.</span><span class="nf">new</span>
<span class="n">io</span><span class="p">.</span><span class="nf">total_out</span> <span class="c1"># => 0</span>
<span class="n">io</span> <span class="o"><<</span> <span class="n">s</span><span class="p">.</span><span class="nf">byteslice</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">6</span><span class="p">)</span>
<span class="n">io</span><span class="p">.</span><span class="nf">total_out</span> <span class="c1"># => 2</span>
<span class="n">io</span> <span class="o"><<</span> <span class="n">s</span><span class="p">.</span><span class="nf">byteslice</span><span class="p">(</span><span class="mi">6</span><span class="p">,</span><span class="mi">7</span><span class="p">)</span>
<span class="n">io</span><span class="p">.</span><span class="nf">total_out</span> <span class="c1"># => 100</span>
</code></pre> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=844962020-03-05T13:59:48Zjmreid (Justin Reid)
<ul></ul><p>jeremyevans0 (Jeremy Evans) wrote in <a href="#note-8">#note-8</a>:</p>
<blockquote>
<p><code>total_out</code> doesn't give you the full size of the output until after the input is fully processed. So if there is an exception or other early exit from the block passed to <code>inflater</code> before the body is fully inflated, you can end up with an incorrect result. Also, the <code>Content-Length</code> header inside the block would still be wrong. You could remove it before the block and only set it on success after the block, that's probably the best way to handle it if you want to modify it.</p>
</blockquote>
<p>Ah, understood!</p> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=906722021-03-01T21:59:36Zjeremyevans0 (Jeremy Evans)merch-redmine@jeremyevans.net
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Assigned</i></li><li><strong>Assignee</strong> set to <i>naruse (Yui NARUSE)</i></li></ul><p>After some time to think about this, I agree with <a class="user active user-mention" href="https://bugs.ruby-lang.org/users/16316">@jmreid (Justin Reid)</a> that updating the Content-Length is the simplest way to address this issue. I've added a pull request that implements this change: <a href="https://github.com/ruby/net-http/pull/16" class="external">https://github.com/ruby/net-http/pull/16</a></p> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=907232021-03-02T22:01:30ZEregon (Benoit Daloze)
<ul></ul><p>Looks good to me.<br>
I think it's important to still remove the <code>Content-Encoding</code> headers, otherwise a caller of net/http might inflate again (notably RubyGems had code related to that until recently).</p> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=971332022-04-01T17:49:50Zjeremyevans (Jeremy Evans)code@jeremyevans.net
<ul><li><strong>Status</strong> changed from <i>Assigned</i> to <i>Closed</i></li></ul><p>Applied in changeset <a class="changeset" title="[ruby/net-http] Update the content-length heading when decoding bodies Previously, the content-e..." href="https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/58adb1636be32fb95173f01e448673dbae4511b0">git|58adb1636be32fb95173f01e448673dbae4511b0</a>.</p>
<hr>
<p>[ruby/net-http] Update the content-length heading when decoding bodies</p>
<p>Previously, the content-encoding header was removed and the body<br>
was modified, but the content-length header was not modified,<br>
resulting in the content-length header not matching the body<br>
length.</p>
<p>Fixes [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: net/http leaves original content-length header intact after inflating response (Closed)" href="https://bugs.ruby-lang.org/issues/16672">#16672</a>]</p>
<p><a href="https://github.com/ruby/net-http/commit/a7cb30124c" class="external">https://github.com/ruby/net-http/commit/a7cb30124c</a></p> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=971352022-04-02T19:59:35ZMSP-Greg (Greg L)
<ul></ul><p>I posted a comment in the commit, see <a href="https://github.com/ruby/ruby/commit/58adb1636be32fb95173f01e448673dbae4511b0#commitcomment-70298450" class="external">https://github.com/ruby/ruby/commit/58adb1636be32fb95173f01e448673dbae4511b0#commitcomment-70298450</a></p>
<p>Noticed failures in another repo this morning, happened in Rubygems/Bundler operations, and only with Ruby master. Tried some Net::HTTP things locally on both Windows ucrt and WSL2/Ubuntu, and they failed. Simple repo:</p>
<pre><code>ruby -ropen-uri -e "puts URI.parse('http://raw.githubusercontent.com/ruby/ruby/master/version.h').read"
</code></pre>
<p>Reverting the change, and it works as expected... Haven't had time to review the comments here.</p> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=971362022-04-02T20:34:37Zst0012 (Stan Lo)stan001212@gmail.com
<ul></ul><p><a class="user active user-mention" href="https://bugs.ruby-lang.org/users/11129">@MSP-Greg (Greg L)</a> I've also noticed this today and reported it in the net/http repo: <a href="https://github.com/ruby/net-http/issues/49" class="external">https://github.com/ruby/net-http/issues/49</a><br>
And also proposed a fix for it: <a href="https://github.com/ruby/net-http/pull/50" class="external">https://github.com/ruby/net-http/pull/50</a></p> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=971372022-04-02T21:52:58Zjeremyevans0 (Jeremy Evans)merch-redmine@jeremyevans.net
<ul><li><strong>Status</strong> changed from <i>Closed</i> to <i>Open</i></li></ul><p>I reverted the commit for now. I'll try testing the proposed fix in the coming week.</p> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=971382022-04-02T22:34:27Zst0012 (Stan Lo)stan001212@gmail.com
<ul></ul><p>That's great, thank you!</p> Ruby master - Bug #16672: net/http leaves original content-length header intact after inflating responsehttps://bugs.ruby-lang.org/issues/16672?journal_id=972082022-04-13T17:41:35Zjeremyevans (Jeremy Evans)code@jeremyevans.net
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Closed</i></li></ul><p>Applied in changeset <a class="changeset" title="[ruby/net-http] Update the content-length heading when decoding bodies Previously, the content-e..." href="https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/0579486f154e80d17521494003dcd2499ef74688">git|0579486f154e80d17521494003dcd2499ef74688</a>.</p>
<hr>
<p>[ruby/net-http] Update the content-length heading when decoding bodies</p>
<p>Previously, the content-encoding header was removed and the body<br>
was modified, but the content-length header was not modified,<br>
resulting in the content-length header not matching the body<br>
length.</p>
<p>Don't delete content-length before yielding inflate body, as that<br>
causes a switch to read the entire body instead of reading in<br>
chunks.</p>
<p>Fixes [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: net/http leaves original content-length header intact after inflating response (Closed)" href="https://bugs.ruby-lang.org/issues/16672">#16672</a>]</p>
<p><a href="https://github.com/ruby/net-http/commit/58284e9710" class="external">https://github.com/ruby/net-http/commit/58284e9710</a></p>
<p>Co-authored-by: st0012 <a href="mailto:stan001212@gmail.com" class="email">stan001212@gmail.com</a></p>