Feature #2542

closed

URI lib should be updated to RFC 3986

Added by marcandre (Marc-Andre Lafortune) about 14 years ago. Updated over 9 years ago.

Status:
Closed
Target version:
[ruby-core:27360]

Description

RFC 2396 has been obsolete for nearly 5 years now.

It was replaced by RFC 3986, which aims to clarify aspects that were not previously clear.


Related issues 5 (0 open, 5 closed)

Related to Ruby master - Bug #4110: When the hostname begins with a digit, WEBrick tests raise an Error | Closed | naruse (Yui NARUSE) | 12/02/2010
Related to Ruby master - Bug #4673: URI::Generic registry is not properly set. | Rejected | naruse (Yui NARUSE)
Related to Ruby master - Bug #8352: URI squeezes a sequence of slashes in merging paths when it shouldn't | Closed | naruse (Yui NARUSE)
Related to Ruby master - Bug #10402: URI regression in 2.2.0-preview1 (bad URI(is not URI?): URI::InvalidURIError) | Closed | naruse (Yui NARUSE)
Related to Ruby master - Bug #9990: URI.parse and URI.encode use different RFCs | Closed | naruse (Yui NARUSE)
Actions #1

Updated by marcandre (Marc-Andre Lafortune) about 14 years ago

  • Subject changed from "URI lib should be updated to RFC 39886" to "URI lib should be updated to RFC 3986"
Actions #3

Updated by duerst (Martin Dürst) about 14 years ago

No, RFC 3986 (URI) will NOT be updated. RFC 3987 (IRI), in due time,
will be updated. See also
http://www.ietf.org/ibin/c5i?mid=6&rid=49&gid=0&k1=934&k2=7294&tid=1262671757.

Regards, Martin.


#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp

Actions #4

Updated by naruse (Yui NARUSE) about 14 years ago

Ah, Martin is right: it is RFC 3987 that will be updated, and RFC 3986 will remain in force.

Anyway, Bob Aman has introduced an alternative library named Addressable.
http://addressable.rubyforge.org/
It looks good, but it has some incompatibilities with the current URI library.
I'll suggest to bundle Addressable and obsolete current URI lib,
but I have to plan its migration path.

Actions #5

Updated by mame (Yusuke Endoh) about 14 years ago

  • Target version changed from 1.9.2 to 2.0.0

Hi,

> I'll suggest to bundle Addressable and obsolete current URI lib,
> but I have to plan its migration path.

This ticket seems to need much work.
I guess we can't make the deadline of spec freezing.
So I change the target to 1.9.x.
If you want 1.9.2 to include the feature, please discuss right now.


Yusuke Endoh

Actions #6

Updated by marcandre (Marc-Andre Lafortune) about 14 years ago

  • Target version changed from 2.0.0 to 1.9.2

I feel the spec for 1.9.2 has been quite clear for 5 years ... follow RFC 3986!

Integrating some of the features of the addressable gem can be discussed later.

Do we have to wait for Akira Yamada, the official maintainer of this library?

Actions #7

Updated by naruse (Yui NARUSE) about 14 years ago

I object to targeting 1.9.2.
Following RFC 3986 introduces some incompatibilities.
It shouldn't be done without careful consideration.

Actions #8

Updated by duerst (Martin Dürst) about 14 years ago

Hello Yui,

Is there a list of incompatibilities, or can you make one?

Regards, Martin.


#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp

Actions #9

Updated by naruse (Yui NARUSE) about 14 years ago

2010/3/25 "Martin J. Dürst":

> Is there a list of incompatibilities, or can you make one?

Some structures of the syntax are changed in RFC 3986.
This breaks URI::REGEXP::PATTERN::TOPLABEL and some other constants.
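
To illustrate the kind of grammar change mentioned here: RFC 2396 requires the last label of a hostname (toplabel) to start with a letter, while RFC 3986 defines the host's reg-name as a much looser character set. The following is only a sketch written from the two RFC grammars. The constant names are hypothetical and are not the patterns defined in URI::REGEXP::PATTERN.

```ruby
# Illustrative patterns only; these are NOT the library's own constants.
# All names below are hypothetical.
ALNUM       = '[A-Za-z0-9]'
DOMAINLABEL = "#{ALNUM}(?:[A-Za-z0-9\\-]*#{ALNUM})?"
TOPLABEL    = "[A-Za-z](?:[A-Za-z0-9\\-]*#{ALNUM})?"

# RFC 2396: hostname = *( domainlabel "." ) toplabel [ "." ]
RFC2396_HOSTNAME = /\A(?:#{DOMAINLABEL}\.)*#{TOPLABEL}\.?\z/

# RFC 3986: reg-name = *( unreserved / pct-encoded / sub-delims )
RFC3986_REG_NAME = /\A(?:[A-Za-z0-9._~!$&'()*+,;=\-]|%\h\h)*\z/

# Print how each grammar treats a few sample hosts.
%w[example.com 1example.com example.123].each do |host|
  puts format('%-14s rfc2396=%-5s rfc3986=%s', host,
              RFC2396_HOSTNAME.match?(host), RFC3986_REG_NAME.match?(host))
end
```

Under these sketch patterns, a host whose final label begins with a digit is rejected by the RFC 2396 form but accepted by the RFC 3986 form, which is why constants tied to the old grammar cannot simply be carried over.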


NARUSE, Yui

Actions #10

Updated by znz (Kazuhiro NISHIYAMA) almost 14 years ago

  • Status changed from Open to Assigned
  • Target version changed from 1.9.2 to 2.0.0
Actions #11

Updated by hedgehog (Hedge Hog) almost 14 years ago

Rather than reinvent anything. Consider employing an FFI interface to uriparser:

http://uriparser.sourceforge.net/

not sure if there is a port for windows or if an equivalent windows lib is available?

Actions #12

Updated by luislavena (Luis Lavena) almost 14 years ago

On Wed, May 12, 2010 at 8:01 PM, Hedge Hog wrote:

> Issue #2542 has been updated by Hedge Hog.
>
> Rather than reinvent anything.  Consider employing an FFI interface to uriparser:
>
> http://uriparser.sourceforge.net/
>
> not sure if there is a port for windows or if an equivalent windows lib is available?

We would not only depend on the uriparser C library but would also require
libcpptest to be able to configure and compile uriparser.


Luis Lavena
AREA 17


Perfection in design is achieved not when there is nothing more to add,
but rather when there is nothing more to take away.
Antoine de Saint-Exupéry

Actions #13

Updated by philo (Philippe Lucas) over 13 years ago

It depends on libcpptest only for its tests, so you can build the package with '--disable-test'.

Actions #14

Updated by naruse (Yui NARUSE) over 13 years ago

  • Assignee changed from akira (akira yamada) to naruse (Yui NARUSE)

I have come to think the uri lib should move to RFC 3986 even if it breaks some compatibility.
But I don't want the new implementation/spec to be a white box as well, like the current one.
So I think:

  • keep the current URI::REGEXP, URI::Parser and so on.
  • at least URI.parse should use the new implementation instead of URI::Parser.

How about this?

Updated by nikosd (Nikos Dimitrakopoulos) over 11 years ago

Are there any plans for actually fixing this? Not sure I can help, and no troll appetite - just asking :)

Updated by mame (Yusuke Endoh) over 11 years ago

  • Target version changed from 1.9.4 to 2.6

Naruse-san, could you please reply to Nikos?

I'm setting the target to the next minor release, but if you are willing to do something for 2.0.0 and the impact is small enough, I may accept it.

--
Yusuke Endoh

Updated by naruse (Yui NARUSE) almost 11 years ago

Just an experimental implementation:
http://github.com/nurse/url

Updated by naruse (Yui NARUSE) almost 10 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

Applied in changeset r46491.


support RFC3986 [Feature #2542]

  • lib/uri/common.rb (URI::REGEXP): move to lib/uri/rfc2396_parser.rb.

  • lib/uri/common.rb (URI::Parser): ditto.

  • lib/uri/common.rb (URI.split): use RFC3986_Parser.

  • lib/uri/common.rb (URI.parse): ditto.

  • lib/uri/common.rb (URI.join): ditto.

  • lib/uri/common.rb (URI.extract): deprecated.

  • lib/uri/common.rb (URI.regexp): ditto.

  • lib/uri/rfc2396_parser.rb: added.

  • lib/uri/rfc3986_parser.rb: added.
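
As a rough illustration of what this changeset means for callers, the sketch below parses one URL three ways: through URI.parse and through each of the two parser classes named in the file list. It assumes a Ruby version that ships both lib/uri/rfc2396_parser.rb and lib/uri/rfc3986_parser.rb (2.2 or later); the sample URL is made up.

```ruby
require 'uri'

url = 'http://example.com/a/b?x=1#frag'

# URI.parse goes through whichever parser the library uses by default.
default_result = URI.parse(url)

# Both parser classes can also be instantiated and used explicitly.
rfc3986 = URI::RFC3986_Parser.new.parse(url)
rfc2396 = URI::RFC2396_Parser.new.parse(url)

# Print the components each parser extracted.
[default_result, rfc3986, rfc2396].each do |u|
  puts [u.class, u.host, u.path, u.query, u.fragment].inspect
end
```

For a URL like this one, which is valid under both RFCs, the components agree. The parsers only differ on input that one grammar accepts and the other rejects.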

Updated by tenderlovemaking (Aaron Patterson) over 9 years ago

r46491 broke this script:

require 'uri'

thing = URI.parse 'http://example.com'
thing.query = 'location[]=1&location[]=2&age_group[]=2'

Before r46491 it would set the query; after r46491 it raises an exception.

Is this a bug in the new implementation? Or should I be doing something different? (I pulled this from the Rails tests, so I'm not 100% sure what it is actually for)

Updated by bitsweat (Jeremy Daer) over 9 years ago

In RFC 3986, square brackets are no longer allowed in the query part.

Source of the unescaped brackets, in this case: https://github.com/brynary/rack-test/blob/master/lib/rack/test/utils.rb

This may become a common issue since plenty of code uses URI.parse and expects its more permissive RFC 2396 parsing.
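
One way for calling code to avoid depending on the permissive behaviour is to percent-encode the brackets before assigning the query. The sketch below uses URI.encode_www_form from the standard library; the parameter names are taken from the script above, and the trailing slash on the URL is just a choice for the example.

```ruby
require 'uri'

# Parameter names contain "[" and "]", as in the Rails-style query above.
params = [['location[]', 1], ['location[]', 2], ['age_group[]', 2]]

# encode_www_form percent-encodes "[" and "]" as %5B and %5D.
query = URI.encode_www_form(params)

uri = URI.parse('http://example.com/')
uri.query = query
puts uri

# Decoding recovers the bracketed names (values come back as strings).
decoded = URI.decode_www_form(uri.query)
puts decoded.inspect
```

This keeps the URI valid under RFC 3986 while the receiving side still sees the `location[]` keys after decoding.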

Updated by zzak (zzak _) over 9 years ago

I think #9990 is related /cc @naruse (Yui NARUSE) @JK @tenderlove

Updated by naruse (Yui NARUSE) over 9 years ago

I'm considering changing the error policy of the URI library, for example:
BEFORE: raise error if invalid characters exist
AFTER: percent-escape them
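
A minimal sketch of what the AFTER policy could look like for a query string follows. It is not the library's implementation: the helper name and constant are hypothetical, and the allowed character set is taken from the RFC 3986 query grammar (unreserved, sub-delims, ":", "@", "/", "?").

```ruby
# Hypothetical helper for illustration; not part of the URI library.
# Matches a single character that RFC 3986 permits literally in a query.
QUERY_LITERAL = %r{\A[A-Za-z0-9._~!$&'()*+,;=:@/?\-]\z}

def escape_invalid_query_chars(query)
  # Match existing %XX escapes first so they are not double-escaped.
  query.gsub(/%\h\h|./m) do |token|
    next token if token.length == 3 || token.match?(QUERY_LITERAL)
    # Percent-escape every byte of the disallowed character.
    token.bytes.map { |byte| format('%%%02X', byte) }.join
  end
end

puts escape_invalid_query_chars('location[]=1&location[]=2&age_group[]=2')
```

With this approach, input such as the bracketed query reported above would be rewritten rather than rejected, at the cost of silently changing what the caller passed in.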

Updated by lengarvey (Leonard Garvey) over 9 years ago

I've implemented something similar to that policy in the following gist: https://gist.github.com/lengarvey/31983eac6664351ed16d

It's a very basic naive implementation but I believe it roughly does what we need URI.parse to do.

Updated by nobu (Nobuyoshi Nakada) over 9 years ago

  • Related to Bug #10402: URI regression in 2.2.0-preview1 (bad URI(is not URI?): URI::InvalidURIError) added

Updated by nobu (Nobuyoshi Nakada) over 9 years ago

  • Related to Bug #9990: URI.parse and URI.encode use different RFCs added