Bug #9974

closed

Regression: URI.parse allows invalid URIs

Added by ggilder (Gabriel Gilder) almost 10 years ago. Updated over 7 years ago.

Status: Rejected
Assignee: -
Target version: -
ruby -v: ruby 2.2.0dev (2014-06-23 trunk 46517) [x86_64-darwin13]
[ruby-core:63304]

Description

$ ruby -v
ruby 2.2.0dev (2014-06-23 trunk 46517) [x86_64-darwin13]
$ ruby -ruri -e "puts URI.parse('http://test_example')"
http://test_example

Compare to Ruby 2.1.2:

$ ruby -v
ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-darwin13.0]
$ ruby -ruri -e "puts URI.parse('http://test_example')"
/Users/gabriel/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/generic.rb:214:in `initialize': the scheme http does not accept registry part: test_example (or bad hostname?) (URI::InvalidURIError)
        from /Users/gabriel/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/http.rb:84:in `initialize'
        from /Users/gabriel/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:214:in `new'
        from /Users/gabriel/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:214:in `parse'
        from /Users/gabriel/.rvm/rubies/ruby-2.1.2/lib/ruby/2.1.0/uri/common.rb:747:in `parse'
        from -e:1:in `<main>'

This appears to be a regression as hostnames cannot legally contain underscores: http://en.wikipedia.org/wiki/Hostname#Restrictions%5Fon%5Fvalid%5Fhost%5Fnames
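For callers who want the stricter hostname rule regardless of what URI.parse accepts, one possible approach is to check the host after parsing. The sketch below is only an illustration under stated assumptions: strict_hostname? and parse_with_strict_host are hypothetical helpers, not part of Ruby's uri library, and they do not handle IP literals such as [::1].

```ruby
require 'uri'

# One label of a hostname: letters, digits, and hyphens only,
# 1 to 63 characters, not starting or ending with a hyphen.
LDH_LABEL = /\A[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\z/

# Hypothetical helper: true when every dot-separated label is a valid
# letter-digit-hyphen label. A trailing dot is rejected by this sketch.
def strict_hostname?(host)
  return false if host.nil? || host.empty? || host.length > 253
  host.split('.', -1).all? { |label| LDH_LABEL.match?(label) }
end

# Hypothetical wrapper: parse as usual, then apply the stricter host check.
def parse_with_strict_host(str)
  uri = URI.parse(str)
  unless strict_hostname?(uri.host)
    raise URI::InvalidURIError, "bad hostname: #{uri.host.inspect}"
  end
  uri
end
```

With this wrapper, parse_with_strict_host('http://test_example') raises URI::InvalidURIError whether or not URI.parse itself accepts the underscore.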


Related issues: 1 (0 open, 1 closed)

Related to Ruby master - Bug #8241: If uri host-part has underscore ( '_' ), 'URI#parse' raise 'URI::InvalidURIError' (Closed, akira (akira yamada))

Updated by ggilder (Gabriel Gilder) almost 10 years ago

Oops, sorry for bad code formatting. In any case you should see that in Ruby 2.1.2 URI::InvalidURIError is raised.

Updated by nobu (Nobuyoshi Nakada) almost 10 years ago

  • Description updated (diff)

Updated by naruse (Yui NARUSE) almost 10 years ago

  • Priority changed from 5 to Normal

Thank you for checking the trunk version.

As you may know, RFC 3986 allows underscores in reg-name, even though DNS hostnames cannot contain them.

   host        = IP-literal / IPv4address / reg-name
   reg-name    = *( unreserved / pct-encoded / sub-delims )
   pct-encoded   = "%" HEXDIG HEXDIG

   unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
   reserved      = gen-delims / sub-delims
   gen-delims    = ":" / "/" / "?" / "#" / "[" / "]" / "@"
   sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
                 / "*" / "+" / "," / ";" / "="
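To make the grammar above concrete, here is a minimal Ruby sketch that checks a host string against the reg-name production only. REG_NAME and reg_name? are illustrative names, not part of the uri library, and the check ignores the IP-literal and IPv4address alternatives of host.

```ruby
# reg-name = *( unreserved / pct-encoded / sub-delims )
#   unreserved  -> A-Z a-z 0-9 - . _ ~
#   sub-delims  -> ! $ & ' ( ) * + , ; =
#   pct-encoded -> "%" followed by two hex digits (\h matches a hex digit)
REG_NAME = /\A(?:[A-Za-z0-9\-._~!$&'()*+,;=]|%\h\h)*\z/

# Illustrative helper: true when the string fits the reg-name production.
def reg_name?(host)
  REG_NAME.match?(host)
end

puts reg_name?('test_example')  # underscore is in "unreserved"
puts reg_name?('test example')  # a raw space is not allowed
```

Under this grammar the underscore is an ordinary unreserved character, so 'test_example' is accepted as a reg-name.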

Updated by naruse (Yui NARUSE) almost 10 years ago

  • Related to Bug #8241: If uri host-part has underscore ( '_' ), 'URI#parse' raise 'URI::InvalidURIError' added

Updated by coldnebo (Larry Kyrala) over 7 years ago

The URI abstraction follows RFC 3986 (URI syntax) more directly than RFC 952 (hostnames). The confusion is understandable.

Still, standards-based systems exist right now that allow this (e.g. we have an nginx-deployed application that uses an underscore in its fourth-level name, and it works everywhere else).

But any Ruby client that attempts to integrate with this endpoint fails. This puts Ruby at a disadvantage. I'd argue that what the Python/Java solutions etc. accept may not be 'legal', but it is at least 'de facto'.

Updated by naruse (Yui NARUSE) over 7 years ago

  • Status changed from Open to Rejected

Larry Kyrala wrote:

The URI abstraction follows RFC 3986 (URI syntax) more directly than RFC 952 (hostnames). The confusion is understandable.

Still, standards-based systems exist right now that allow this (e.g. we have an nginx-deployed application that uses an underscore in its fourth-level name, and it works everywhere else).

But any Ruby client that attempts to integrate with this endpoint fails. This puts Ruby at a disadvantage. I'd argue that what the Python/Java solutions etc. accept may not be 'legal', but it is at least 'de facto'.

JavaScript's new URL("http://test_example") accepts the underscore.
Therefore I don't think raising an error is the de facto behavior.
