Bug #15933

OpenURI: Assign default charset for HTTPS as well as HTTP

Added by gareth (Gareth Adams) over 2 years ago. Updated over 2 years ago.

Status: Closed
Priority: Normal
Target version: -
[ruby-core:93206]

Description

Using open-uri to load a document in the following circumstances:

  • The Content-Type header is text/* and doesn't specify a charset, e.g. Content-Type: text/csv
  • The document is loaded from an https:// URL

…will cause the resulting string to have ASCII-8BIT encoding.
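To illustrate why the ASCII-8BIT tag matters in practice, here is a minimal sketch that needs no network access. The byte string is a made-up stand-in for a response body (an assumption, not data from this report); it shows that binary-tagged bytes cannot be transcoded to UTF-8 until a real charset is assigned.

```ruby
# Stand-in for a response body: the bytes of "café" in ISO-8859-1.
# String#b returns a copy tagged ASCII-8BIT (binary), which is the
# encoding the report describes for the https:// case.
raw = "caf\xE9".b

# Transcoding binary data containing a non-ASCII byte raises an error,
# because Ruby does not know which character \xE9 represents.
binary_error =
  begin
    raw.encode("UTF-8")
    nil
  rescue Encoding::UndefinedConversionError => e
    e.class
  end

# Once the bytes are tagged with the RFC 2616 default charset,
# the same conversion succeeds.
latin1 = raw.dup.force_encoding("ISO-8859-1")
utf8 = latin1.encode("UTF-8")
```

With the http:// default applied, callers get the second behaviour automatically; in the https:// case described here they have to call force_encoding themselves.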

As the documentation for OpenURI::Meta#charset mentions, RFC 2616, section 3.7.1 says:

When no explicit charset parameter is provided by the sender, media subtypes of the "text" type are defined to have a default charset value of "ISO-8859-1" when received via HTTP.

OpenURI takes this literally: it assigns ISO-8859-1 only if @base_uri.scheme is exactly "http". This check was written in 2002, 17 years ago, before TLS 1.1 was defined and well before HTTPS was common.

I believe this check should now also match the scheme "https". As RFC 2818, section 2 says:

Conceptually, HTTP/TLS is very simple. Simply use HTTP over TLS precisely as you would use HTTP over TCP.
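The scheme check and the proposed change can be sketched as follows. This is a simplified model, not the actual open-uri source or the attached patch: the method name default_charset, its parameters, and the two pattern constants are hypothetical, and the Content-Type parsing is deliberately naive (it does not handle quoted parameter values).

```ruby
# Hypothetical, simplified model of the default-charset decision.
# scheme_pattern is passed in so both behaviours can be compared.
def default_charset(content_type, scheme, scheme_pattern)
  type, *params = content_type.split(";").map(&:strip)

  # An explicit charset parameter always wins.
  explicit = params.map { |p| p.split("=", 2) }
                   .find { |key, _| key.to_s.downcase == "charset" }
  return explicit[1].to_s.downcase if explicit

  # RFC 2616 3.7.1 default for text/* when the scheme qualifies.
  if type.to_s.downcase.start_with?("text/") && scheme_pattern.match?(scheme.to_s)
    return "iso-8859-1"
  end

  nil
end

HTTP_ONLY     = /\Ahttp\z/i   # behaviour described in this report
HTTP_OR_HTTPS = /\Ahttps?\z/i # behaviour proposed in this report

current  = default_charset("text/csv", "https", HTTP_ONLY)
proposed = default_charset("text/csv", "https", HTTP_OR_HTTPS)
```

Under these assumptions, current is nil (so the body stays ASCII-8BIT) while proposed is "iso-8859-1". Responses that declare a charset, and non-text media types, are unaffected by the change.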

  1. Is this a suitable change to make?

  2. I have a patch to fix the functionality (attached). What else do I need to specify in terms of documentation/tests? I'm happy to put more work into this, but it's my first contribution to Ruby core and I'd like some pointers. I've read through https://bugs.ruby-lang.org/projects/ruby/wiki/HowToReport


Files

ruby-changes.patch (1.21 KB), gareth (Gareth Adams), 06/17/2019 07:01 PM
ruby-changes.patch (3.05 KB), gareth (Gareth Adams), 06/19/2019 11:42 AM
ruby-changes-combined.patch (2.24 KB), gareth (Gareth Adams), 06/27/2019 05:42 PM