Bug #9990

URI.parse and URI.encode use different RFCs

Added by Leonard Garvey about 1 year ago. Updated 12 months ago.

[ruby-core:63390]
Status:Assigned
Priority:Normal
Assignee:Yui NARUSE
ruby -v:2.2.0dev
Backport:2.0.0: UNKNOWN, 2.1: UNKNOWN

Description

The latest code for URI.parse uses RFC3986, but URI.encode/URI.escape still use the old URI::RFC2396_Parser implementation of encode. This causes problems where the two specs diverge.

In RFC3986, square brackets "[" and "]" are reserved and need to be percent-encoded in the query string, although they didn't need to be in RFC2396. This means the following URL cannot be parsed by the new parser, and isn't encoded correctly by the old encoder: https://bugs.ruby-lang.org/projects/ruby-trunk/issues?set_filter=1&f[]=status_id&op[status_id]=o

Here's a quick ruby script which demonstrates the issue on 2.2.0dev:

require 'uri'

url = "https://bugs.ruby-lang.org/projects/ruby-trunk/issues?set_filter=1&f[]=status_id&op[status_id]=o"
puts URI.encode(url)        # old RFC2396 encoder
URI.parse(URI.encode(url))  # new RFC3986 parser

The output of running this script can be seen at: https://gist.github.com/lengarvey/c1d17913f9ea95fd999c
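
The encoder half of the problem can also be checked in isolation, without relying on URI.encode being wired to any particular parser. The following is a minimal sketch that calls URI::RFC2396_Parser#escape directly; the variable name `escaped` is only illustrative. Since "[" and "]" are in that parser's set of characters it treats as safe, the URL should come back unchanged:

```ruby
require 'uri'

url = "https://bugs.ruby-lang.org/projects/ruby-trunk/issues?set_filter=1&f[]=status_id&op[status_id]=o"

# Call the RFC2396 implementation explicitly instead of going through URI.encode.
escaped = URI::RFC2396_Parser.new.escape(url)

puts escaped
puts escaped == url           # true if nothing was percent-encoded
puts escaped.include?("[")    # true if the brackets were left as-is
```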

I believe a new encoder needs to be written up according to the RFC3986 spec and this should be used as the default in URI.
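
As a rough illustration of what such an encoder could look like for a single query component, here is a minimal sketch. The helper name `rfc3986_encode_component` is hypothetical and not part of the URI library; it assumes the strictest reading of RFC3986, where only the unreserved characters (letters, digits, "-", ".", "_", "~") are left alone and everything else is percent-encoded byte by byte:

```ruby
require 'uri'

# Hypothetical helper, not part of the URI library.
# Percent-encodes every byte of every character outside the
# RFC3986 "unreserved" set: ALPHA / DIGIT / "-" / "." / "_" / "~".
def rfc3986_encode_component(str)
  str.to_s.gsub(/[^A-Za-z0-9\-._~]/) do |ch|
    ch.bytes.map { |b| format('%%%02X', b) }.join
  end
end

# The query parameters from the URL in this report, as key/value pairs.
params = [
  ["set_filter", "1"],
  ["f[]", "status_id"],
  ["op[status_id]", "o"]
]

# Encode keys and values separately so "&" and "=" keep their delimiter role.
query = params.map { |k, v|
  "#{rfc3986_encode_component(k)}=#{rfc3986_encode_component(v)}"
}.join("&")

puts query
# set_filter=1&f%5B%5D=status_id&op%5Bstatus_id%5D=o

uri = URI.parse("https://bugs.ruby-lang.org/projects/ruby-trunk/issues?#{query}")
puts uri.query
```

Encoding each component on its own, rather than the whole URL string, is a design choice of this sketch: a whole-URL encoder cannot tell a delimiter "&" from a literal one inside a value.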


Related issues

Related to Ruby trunk - Feature #2542: URI lib should be updated to RFC 3986 Closed 01/01/2010

History

#1 Updated by Zachary Scott about 1 year ago

  • Status changed from Open to Feedback

Did you see r46491?

#2 Updated by Leonard Garvey about 1 year ago

I did see it. This bug points out that URI.escape isn't covered by that change. I'm not sure whether there's a more appropriate place for that feedback than raising this issue. This is a separate issue from the one raised by @tenderlove in #2542, though. Aaron seems to be saying that URI.parse has changed semantics significantly and breaks existing code; this issue demonstrates that there is no way to properly encode a URI so that URI.parse will accept it.

#3 Updated by Tomoyuki Chikanaga 12 months ago

  • Status changed from Feedback to Assigned

#4 Updated by Nobuyoshi Nakada 10 months ago

  • Related to Feature #2542: URI lib should be updated to RFC 3986 added
