Bug #454

URI does not follow the latest RFC on URI syntax

Added by Cyrille Faucheux almost 7 years ago. Updated about 4 years ago.

[ruby-core:18335]
Status:Rejected
Priority:Normal
Assignee:akira yamada
ruby -v:-
Backport:

Description

=begin
According to the latest RFC on URI syntax, RFC 3986 (http://www.ietf.org/rfc/rfc3986.txt), I have found at least two "bugs" in the uri library.

The "#" character is a delimiter and shouldn't be escaped. The current implementation escapes it, so the escaped URI no longer identifies the same resource and fragment.

Example:

URI.escape('http://www.example.com/the page.html#fragment')

"http://www.example.com/the%20page.html%23fragment"

As a quick patch, the "#" character could be added to the URI::REGEXP::RESERVED pattern.

Similarly, URI::REGEXP::UNRESERVED includes characters that the RFC does not mark as unreserved.

URI : UNRESERVED = "-_.!~*'()#{ALNUM}"

RFC : unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
=end
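
To make the second point concrete, here is a minimal sketch that diffs the two unreserved sets quoted above (alphanumerics left out). The constant names are made up for illustration and are not part of lib/uri.

```ruby
# Non-alphanumeric unreserved characters in each definition.
# Both constant names are hypothetical, chosen only for this sketch.
LIB_URI_MARKS = "-_.!~*'()".chars  # from UNRESERVED in lib/uri (RFC 2396 "mark")
RFC3986_MARKS = "-._~".chars       # from RFC 3986: "-" / "." / "_" / "~"

# Characters lib/uri treats as unreserved but RFC 3986 does not.
no_longer_unreserved = LIB_URI_MARKS - RFC3986_MARKS
puts no_longer_unreserved.join(" ")  # ! * ' ( )
```

RFC 3986 lists those five characters among its sub-delims, which makes them reserved.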

454.patch (1.62 KB) Jonas Witt, 12/19/2008 03:44 AM

History

#1 Updated by Jonas Witt over 6 years ago

=begin
lib/uri/common.rb currently references RFCs 2732 and 2396. RFC 3986, as linked to by the original poster, obsoletes both RFCs.

The attached patch adjusts the definitions of UNRESERVED and RESERVED to comply with RFC 3986, thereby fixing the "#" issue, among others:

URI.escape('http://www.example.com/the page.html#fragment')
=> "http://www.example.com/the%20page.html#fragment"

This bug is filed against Ruby 1.8, but it's the same on the 1.9.1 branch and the same patch applies.

=end
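
The behavior described above can be approximated without touching lib/uri. This is only a sketch under the assumption that every RFC 3986 reserved and unreserved character is left alone; `escape_rfc3986` and `SAFE_CHARS` are hypothetical names, and this is not the code in the attached patch.

```ruby
# Hypothetical stand-in for URI.escape using the RFC 3986 character classes.
ALNUM_CHARS = [*"A".."Z", *"a".."z", *"0".."9"].join
GEN_DELIMS  = ":/?#[]@"      # RFC 3986 gen-delims
SUB_DELIMS  = "!$&'()*+,;="  # RFC 3986 sub-delims
SAFE_CHARS  = ALNUM_CHARS + GEN_DELIMS + SUB_DELIMS + "-._~"

# Percent-encode every character outside SAFE_CHARS, byte by byte.
# Like URI.escape, this does not parse the string, so "%" is escaped too.
def escape_rfc3986(str)
  str.each_char.map { |ch|
    SAFE_CHARS.include?(ch) ? ch : ch.bytes.map { |b| format("%%%02X", b) }.join
  }.join
end

puts escape_rfc3986('http://www.example.com/the page.html#fragment')
# http://www.example.com/the%20page.html#fragment
```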

#2 Updated by Shyouhei Urabe over 6 years ago

  • Assignee set to akira yamada
  • ruby -v set to -

#3 Updated by akira yamada over 6 years ago

  • Status changed from Open to Rejected

=begin
URI.escape does not parse the argument string as a URI.
The method only replaces all UNSAFE characters in the string.
=end
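
One way to read this reply: a character-by-character substitution cannot tell a "#" that delimits the fragment from a "#" that is part of the data, so the caller has to escape each component before assembling the URI. A minimal sketch of that approach follows; `escape_component` is a hypothetical helper, not a lib/uri method.

```ruby
# Hypothetical helper: percent-encode everything except the
# RFC 3986 unreserved characters (ALPHA / DIGIT / "-" / "." / "_" / "~").
def escape_component(str)
  str.gsub(/[^A-Za-z0-9\-._~]/) { |ch| ch.bytes.map { |b| format("%%%02X", b) }.join }
end

path_segment = "issue #1.html"  # the "#" here is data, not a delimiter
fragment     = "top"

# Escape the pieces first, then join them with literal delimiters.
uri = "http://www.example.com/" + escape_component(path_segment) +
      "#" + escape_component(fragment)
puts uri  # http://www.example.com/issue%20%231.html#top
```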
