Feature #2542

URI lib should be updated to RFC 3986

Added by marcandre (Marc-Andre Lafortune) over 7 years ago. Updated about 3 years ago.

Status:
Closed
Priority:
Normal
Target version:
[ruby-core:27360]

Description

RFC 2396 has been obsolete for nearly 5 years now.

It was replaced by RFC 3986, which aims to clarify aspects that were not previously clear.


Related issues

Related to Ruby trunk - Bug #4110: Error occurs in the WEBrick tests when the hostname begins with a digit (Closed, 2010-12-02)
Related to Ruby trunk - Bug #4673: URI::Generic registry is not properly set. (Feedback, 2011-05-12)
Related to Ruby trunk - Bug #8352: uri squeezes a sequence of slashes in merging paths when it shouldn't (Open, 2013-05-02)
Related to Ruby trunk - Bug #10402: URI regression in 2.2.0-preview1 (bad URI(is not URI?): URI::InvalidURIError) (Closed)
Related to Ruby trunk - Bug #9990: URI.parse and URI.encode use different RFCs (Assigned, 2014-06-28)

Associated revisions

Revision 46491
Added by naruse (Yui NARUSE) about 3 years ago

support RFC3986 [Feature #2542]

* lib/uri/common.rb (URI::REGEXP): move to lib/uri/rfc2396_parser.rb.

* lib/uri/common.rb (URI::Parser): ditto.

* lib/uri/common.rb (URI.split): use RFC3986_Parser.

* lib/uri/common.rb (URI.parse): ditto.

* lib/uri/common.rb (URI.join): ditto.

* lib/uri/common.rb (URI.extract): deprecated.

* lib/uri/common.rb (URI.regexp): ditto.

* lib/uri/rfc2396_parser.rb: added.

* lib/uri/rfc3986_parser.rb: added.

Revision 46680
Added by naruse (Yui NARUSE) about 3 years ago

* lib/uri/generic.rb (URI::Generic#query=): remove validation, just
escape. [Feature #2542]

* lib/uri/generic.rb (URI::Generic#fragment=): ditto.

* lib/uri/generic.rb (URI::Generic#check_query): removed.

* lib/uri/generic.rb (URI::Generic#set_query): ditto.

* lib/uri/generic.rb (URI::Generic#check_fragment): ditto.

* lib/uri/generic.rb (URI::Generic#set_fragment): ditto.

History

#1 Updated by marcandre (Marc-Andre Lafortune) over 7 years ago

  • Subject changed from URI lib should be updated to RFC 39886 to URI lib should be updated to RFC 3986

#3 Updated by duerst (Martin Dürst) over 7 years ago

No, RFC 3986 (URI) will NOT be updated. RFC 3987 (IRI), in due time,
will be updated. See also
http://www.ietf.org/ibin/c5i?mid=6&rid=49&gid=0&k1=934&k2=7294&tid=1262671757.

Regards, Martin.


#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp mailto:duerst@it.aoyama.ac.jp

#4 Updated by naruse (Yui NARUSE) over 7 years ago

Ah, Martin is right: it's RFC 3987 that will be updated, and RFC 3986 will stay in force.

Anyway, Bob Aman has introduced an alternative library named Addressable.
http://addressable.rubyforge.org/
It looks good, but it has some incompatibilities with the current URI library.
I'd suggest bundling Addressable and obsoleting the current URI lib,
but I have to plan its migration path.

#5 Updated by mame (Yusuke Endoh) over 7 years ago

  • Target version changed from 1.9.2 to 2.0.0

Hi,

> I'll suggest to bundle Addressable and obsolete current URI lib,
> but I have to plan its migration path.

This ticket seems to need much work.
I guess we can't make the spec freeze deadline.
So I am changing the target to 1.9.x.
If you want 1.9.2 to include the feature, please discuss it right now.


Yusuke Endoh mame@tsg.ne.jp

#6 Updated by marcandre (Marc-Andre Lafortune) over 7 years ago

  • Target version changed from 2.0.0 to 1.9.2

I feel the spec for 1.9.2 has been quite clear for 5 years ... follow RFC 3986!

Integrating some of the features of the addressable gem can be discussed later.

Do we have to wait for Akira Yamada, the official maintainer of this library?

#7 Updated by naruse (Yui NARUSE) over 7 years ago

I object to targeting 1.9.2.
Following RFC 3986 introduces some incompatibilities.
It shouldn't be done without careful consideration.

#8 Updated by duerst (Martin Dürst) over 7 years ago

Hello Yui,

Is there a list of incompatibilities, or can you make one?

Regards, Martin.


#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp mailto:duerst@it.aoyama.ac.jp

#9 Updated by naruse (Yui NARUSE) over 7 years ago

2010/3/25 "Martin J. Dürst" duerst@it.aoyama.ac.jp:

> Is there a list of incompatibilities, or can you make one?

Some structures of the syntax have changed in RFC 3986.
This breaks URI::REGEXP::PATTERN::TOPLABEL and some other constants.


NARUSE, Yui
naruse@airemix.jp

#10 Updated by znz (Kazuhiro NISHIYAMA) over 7 years ago

  • Status changed from Open to Assigned
  • Target version changed from 1.9.2 to 2.0.0

#11 Updated by hedgehog (Hedge Hog) about 7 years ago

Rather than reinventing anything, consider employing an FFI interface to uriparser:

http://uriparser.sourceforge.net/

Not sure if there is a port for Windows, or if an equivalent Windows lib is available.

#12 Updated by luislavena (Luis Lavena) about 7 years ago

On Wed, May 12, 2010 at 8:01 PM, Hedge Hog redmine@ruby-lang.org wrote:

> Issue #2542 has been updated by Hedge Hog.
>
> Rather than reinvent anything.  Consider employing an FFI interface to uriparser:
>
> http://uriparser.sourceforge.net/
>
> not sure if there is a port for windows or if an equivalent windows lib is available?

Then we would not only depend on the uriparser C library, but would also require
libcpptest to be able to configure and compile uriparser.


Luis Lavena
AREA 17


Perfection in design is achieved not when there is nothing more to add,
but rather when there is nothing more to take away.
Antoine de Saint-Exupéry

#13 Updated by philo (Philippe Lucas) almost 7 years ago

It depends on libcpptest only for its tests, so you can build the package with '--disable-test'.

#14 Updated by naruse (Yui NARUSE) over 6 years ago

  • Assignee changed from akira (akira yamada) to naruse (Yui NARUSE)

I've come to think the uri lib should move to RFC 3986 even if it breaks some compatibility.
But I don't want the new implementation/spec to be a white box like the current one.
So I think:

  • keep the current URI::REGEXP, URI::Parser and so on.
  • at least URI.parse should not use URI::Parser but a new implementation.

How about this?
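
From the caller's side, the split proposed here (old classes kept, a new implementation alongside them) could look roughly like the sketch below. It assumes the class names URI::RFC2396_Parser and URI::RFC3986_Parser that changeset r46491 later introduced; which parser URI.parse uses by default depends on the Ruby version, so both are instantiated explicitly here.

```ruby
require 'uri'

# Instantiate each parser explicitly instead of relying on URI.parse,
# whose default parser differs between Ruby versions.
old_parser = URI::RFC2396_Parser.new
new_parser = URI::RFC3986_Parser.new

# Illustrative input that is valid under both RFCs.
input = 'http://example.com/path?q=1#frag'

old_uri = old_parser.parse(input)
new_uri = new_parser.parse(input)

# For a plain URI like this one, both parsers yield the same components.
puts old_uri.host      # example.com
puts new_uri.query     # q=1
puts new_uri.fragment  # frag
```

The two parsers only diverge on inputs where the RFCs themselves differ, so code that needs the older, more permissive behavior can keep calling the RFC 2396 parser directly.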

#16 [ruby-core:49658] Updated by nikosd (Nikos Dimitrakopoulos) over 4 years ago

Are there any plans for actually fixing this? Not sure I can help, and no troll appetite - just asking :)

#17 [ruby-core:49975] Updated by mame (Yusuke Endoh) over 4 years ago

  • Target version changed from 1.9.4 to next minor

Naruse-san, could you please answer Nikos?

I'm setting this to next minor, but if you are willing to do something for 2.0.0 and the impact is small enough, I may accept it.

--
Yusuke Endoh mame@tsg.ne.jp

#18 [ruby-core:55190] Updated by naruse (Yui NARUSE) about 4 years ago

Just an experimental implementation:
http://github.com/nurse/url

#19 [ruby-core:63274] Updated by naruse (Yui NARUSE) about 3 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

Applied in changeset r46491.



#20 [ruby-core:63450] Updated by tenderlovemaking (Aaron Patterson) about 3 years ago

r46491 broke this script:

require 'uri'

thing = URI.parse 'http://example.com'
thing.query = 'location[]=1&location[]=2&age_group[]=2'

Before r46491 it would set the query, after r46491, it raises an exception.

Is this a bug in the new implementation? Or should I be doing something different? (I pulled this from the Rails tests, so I'm not 100% sure what it is actually for)

#21 [ruby-core:63452] Updated by bitsweat (Jeremy Daer) about 3 years ago

In RFC 3986, square brackets are no longer allowed in the query part.

Source of the unescaped brackets, in this case: https://github.com/brynary/rack-test/blob/master/lib/rack/test/utils.rb

This may become a common issue since plenty of code uses URI.parse and expects its more permissive RFC 2396 parsing.
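
One way for calling code to stay within the stricter grammar is to build the query from key/value pairs rather than assigning a hand-written string. This is a minimal sketch, not necessarily what rack-test or Rails ended up doing; it uses URI.encode_www_form from the standard library, and the parameter names are taken from the script in the previous comment.

```ruby
require 'uri'

# Build the query from pairs so that "[" and "]" are percent-encoded
# before the string reaches URI::Generic#query=.
params = [['location[]', 1], ['location[]', 2], ['age_group[]', 2]]
query = URI.encode_www_form(params)

uri = URI.parse('http://example.com')
uri.query = query

puts uri.query
# location%5B%5D=1&location%5B%5D=2&age_group%5B%5D=2
```

Because the brackets arrive already escaped as %5B and %5D, the assignment does not depend on how permissive the active parser is.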

#22 [ruby-core:63461] Updated by zzak (Zachary Scott) about 3 years ago

I think #9990 is related /cc naruse (Yui NARUSE) @JK @tenderlove

#23 [ruby-core:63471] Updated by naruse (Yui NARUSE) about 3 years ago

I'm considering changing the error policy of the URI library, for example:
BEFORE: raise an error if invalid characters exist
AFTER: percent-escape them
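
The AFTER policy can be sketched as follows. The helper name escape_invalid_query_chars and the QUERY_CHAR constant are hypothetical and not part of the uri library; the allowed set is taken from the RFC 3986 query grammar (unreserved, sub-delims, ":", "@", "/" and "?"), and the actual change in lib/uri/generic.rb may use a different character set.

```ruby
# Hypothetical helper for illustration; not part of the uri library.
# A single character that RFC 3986 permits unescaped in a query.
QUERY_CHAR = %r{\A[A-Za-z0-9\-._~!$&'()*+,;=:@/?]\z}

def escape_invalid_query_chars(str)
  # Work byte by byte so multibyte characters become one escape per byte.
  escaped = str.b.gsub(/%\h\h|./mn) do |m|
    # Keep existing percent-escapes and characters the grammar allows.
    next m if m.length == 3 || QUERY_CHAR.match?(m)
    # Percent-escape everything else instead of raising.
    format('%%%02X', m.ord)
  end
  escaped.force_encoding(Encoding::US_ASCII)
end

puts escape_invalid_query_chars('location[]=1&location[]=2&age_group[]=2')
# location%5B%5D=1&location%5B%5D=2&age_group%5B%5D=2
```

Existing escapes such as %5B pass through untouched, so applying the helper twice gives the same result as applying it once.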

#24 [ruby-core:63473] Updated by lengarvey (Leonard Garvey) about 3 years ago

I've implemented something similar to that policy in the following gist: https://gist.github.com/lengarvey/31983eac6664351ed16d

It's a very basic, naive implementation, but I believe it roughly does what we need URI.parse to do.

#25 [ruby-core:65803] Updated by nobu (Nobuyoshi Nakada) almost 3 years ago

  • Related to Bug #10402: URI regression in 2.2.0-preview1 (bad URI(is not URI?): URI::InvalidURIError) added

#26 [ruby-core:65805] Updated by nobu (Nobuyoshi Nakada) almost 3 years ago

  • Related to Bug #9990: URI.parse and URI.encode use different RFCs added
