Bug #3788
closedURI cannot parse IPv6 addresses propertly
Description
=begin
require 'uri'
u = URI::parse( 'http://[::1]:8080/test' )
u.host
=> "[::1]"
**** THIS SHOULD READ ::1 ****
irb(main):007:0> u.host = '127.0.0.1'
=> "127.0.0.1"
irb(main):008:0> u.host = '::1'
URI::InvalidComponentError: bad component(expected host component): ::1
from /usr/lib/ruby/1.9.1/uri/generic.rb:388:in check_host' from /usr/lib/ruby/1.9.1/uri/generic.rb:402:in
host='
from (irb):8
from /usr/bin/irb1.9.1:12:in `'
irb(main):009:0> u.host = 'localhost'
=> "localhost"
If settings host to ::1 doesn't work, then setting it to [::1] should produce ::1 hostname. [::1] clearly would not resolve and hostnames should resolve.
=end
Updated by naruse (Yui NARUSE) over 14 years ago
- Status changed from Open to Rejected
=begin
uri lib's structure follows URI RFC; your expectation is wrong.
=end
Updated by darix (Marcus Rückert) over 14 years ago
=begin
require 'uri'
require 'net/http'
url = URI.parse( 'http://[::1]:8080/test/Test' )
Net::HTTP.get( url )
I think something like that should work then, but it doesnt. it gives:
/usr/lib64/ruby/1.9.1/net/http.rb:644:in initialize': getaddrinfo: Name or service not known (SocketError) /usr/lib64/ruby/1.8/net/http.rb:560:in
initialize': getaddrinfo: Name or service not known (SocketError)
Or do we expect users to manually mangle the hostname after the the uri lib parsed it?
=end
Updated by naruse (Yui NARUSE) over 14 years ago
=begin
The machine disables IPv6.
=end
Updated by darix (Marcus Rückert) over 14 years ago
=begin
no it doesnt.
if i use ipv6-localhost instead of [::1] it works.
$ cat tt.rb
#!/usr/bin/ruby
vim: set sw=2 sts=2 et tw=80 :¶
require 'uri'
require 'net/http'
%w{[::1] ipv6-localhost}.each do |hostname|
begin
test_url = 'http://' + hostname + ':80/test/Test'
puts test_url
url = URI.parse( test_url )
puts url.host
puts Net::HTTP.get( url )
rescue Exception => ex
p ex
end
end
$ ruby -v tt.rb
ruby 1.8.7 (2010-06-23 patchlevel 299) [x86_64-linux]
http://[::1]:80/test/Test
[::1]
#<SocketError: getaddrinfo: Name or service not known>
http://ipv6-localhost:80/test/Test
ipv6-localhost
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>404 - Not Found</title>
</head>
<body>
<h1>404 - Not Found</h1>
</body>
</html>
same result with ruby 1.9.2p0 (2010-08-18 revision 29036) [x86_64-linux]
=end
Updated by naruse (Yui NARUSE) over 14 years ago
=begin
Hmm, your point has a reson. I consider this.
http://msdn.microsoft.com/en-us/library/system.uri.dnssafehost.aspx
=end
Updated by darix (Marcus Rückert) over 14 years ago
=begin
so can we reopen the bug so it wont get lost?:)
=end
Updated by adamm (Adam Majer) over 14 years ago
=begin
The original argument is quite simple.
-
URI RFC defines structure of URI (text) so it can be parsed by Ruby's URI (class) and other implementation. RFC define the interop.
-
Purpose of URI Ruby class is to permit access to read, edit and use such URI transparently. URI class should deal with any encoding/unencoding issues.
-
Not handling special escaping of the IPv6 literal internally in the URI Ruby class will result in spaghetti code and code duplication elsewhere. Marcus' example is exactly what happens not only inside Ruby codebase, but in other applications. The expectation is a hostname can be passed to other classes. For example,
url = URI.parse( 'http://[::1]:8080/test/Test' )
TCPSocket.new( url.host, url.port )
So where should this be fixed? In every piece of code that uses URI to connect or in URI helper class?
Here's another library (C++) that does this the way I expected it to,
http://doc.trolltech.com/4.6/qurl.html#host
QUrl u( "http://[::1]:8080/test" );
u.host() => "::1"
In my opinion, adding dnssafehost to URI is bad design. Instead that should be the job the DNS class. "DNS Safe" has impact on protocol level, not API level.
My expectation was that URI.host should return unescaped hostname/IP and URI.host= should take unescaped as argument and escape them to produce a valid URI. This interface could then be easily extended to work with IRI (International domains).
maybe the solution is to rename current members host and host= to host_rfc3986 and host_rfc3986=. Similar syntax already exists in other classes like Time.
Alternatively, have URL.host return a new Host class (non-string) that deals with encoding/unencoding per URI RFC.
Of course, it is up to you to decide how you chose to design this. Again, RFC does not define what URL.host returns. Instead it defines host - a specific part of URL string.
=end
Updated by akr (Akira Tanaka) over 14 years ago
=begin
2010/9/8 Adam Majer redmine@ruby-lang.org:
Issue #3788 has been updated by Adam Majer.
In my opinion, adding dnssafehost to URI is bad design.
I think adding a method is a good design.
My expectation was that URI.host should return unescaped hostname/IP and URI.host= should take unescaped as argument and escape them to produce a valid URI. This interface could then be easily extended to work with IRI (International domains).
maybe the solution is to rename current members host and host= to host_rfc3986 and host_rfc3986=. Similar syntax already exists in other classes like Time.
It introduces an incompatibility.
Also, the primitive method to get/set the host part will be version dependent.
I think adding a new method for deal brackets is better.
% svn diff --diff-cmd diff -x '-u -p'
Index: lib/uri/generic.rb
--- lib/uri/generic.rb (revision 29195)
+++ lib/uri/generic.rb (working copy)
@@ -412,6 +412,38 @@ module URI
v
end
-
extract the host part of the URI and unwrap brackets for IPv6 addresses.¶
-
This method is same as URI::Generic#host except¶
-
brackets for IPv6 (andn future IP) addresses are removed.¶
-
u = URI("http://[::1]/bar")¶
-
p u.hostname #=> "::1"¶
-
p u.host #=> "[::1]"¶
- def hostname
-
v = self.host
-
/\A\[(.*)\]\z/ =~ v ? $1 : v
- end
-
set the host part of the URI as the argument with brackets for¶
IPv6 addresses.
-
This method is same as URI::Generic#host= except¶
-
the argument can be bare IPv6 address.¶
-
u = URI("http://foo/bar")¶
-
p u.to_s #=> "http://foo/bar"¶
-
u.hostname = "::1"¶
-
p u.to_s #=> "http://[::1]/bar"¶
-
If the arugument seems IPv6 address,¶
-
it is wrapped by brackets.¶
- def hostname=(v)
-
v = "[#{v}]" if /\A\[.*\]\z/ !~ v && /:/ =~ v
-
self.host = v
- end
- def check_port(v)
return v unless v
http://redmine.ruby-lang.org/issues/show/3788
--
Tanaka Akira
=end
Updated by darix (Marcus Rückert) over 14 years ago
=begin
On 2010-09-10 06:44:10 +0900, Tanaka Akira wrote:
2010/9/8 Adam Majer redmine@ruby-lang.org:
Issue #3788 has been updated by Adam Majer.
In my opinion, adding dnssafehost to URI is bad design.
I think adding a method is a good design.
My expectation was that URI.host should return unescaped hostname/IP
and URI.host= should take unescaped as argument and escape them to
produce a valid URI. This interface could then be easily extended to
work with IRI (International domains).maybe the solution is to rename current members host and host= to
host_rfc3986 and host_rfc3986=. Similar syntax already exists in
other classes like Time.It introduces an incompatibility.
Also, the primitive method to get/set the host part will be version dependent.I think adding a new method for deal brackets is better.
doesnt that mean you will need to fix all of the users of the function?
with my net/http example above you can somewhat control it.
but e.g. ruby -r open-uri 'p open("http://[::1]:8080/test/test")'
doesnt give me control over which function is used.
i think the better approach might be to let url.host return "::1".
and just encode it when constructing a new url.
another option might be monkeypatching the string object returned by
url.host and add a "dnssafe" method like shown in the msdn article.
then the resolv class could get the dnssafe version when needed. but it
still might lead to trouble when the hostname is passed to other C
extensions.
darix
--
openSUSE - SUSE Linux is my linux
openSUSE is good for you
www.opensuse.org
=end
Updated by akr (Akira Tanaka) over 14 years ago
=begin
2010/9/10 Marcus Rueckert darix@opensu.se:
doesnt that mean you will need to fix all of the users of the function?
with my net/http example above you can somewhat control it.
All occurrences of URI::Generic#host should be examined.
I guess there may be usages which needs the brackets.
At least, one of the use in URI::Generic#find_proxy (defined in open-uri.rb)
is not clear which is suitable between the brackets are removed or not.
% svn diff --diff-cmd diff -x '-u -p'
Index: lib/open-uri.rb
--- lib/open-uri.rb (revision 29207)
+++ lib/open-uri.rb (working copy)
@@ -263,17 +263,17 @@ module OpenURI
# HTTP or HTTPS
if proxy
if proxy_user && proxy_pass
-
klass = Net::HTTP::Proxy(proxy_uri.host, proxy_uri.port,
proxy_user, proxy_pass)
-
klass = Net::HTTP::Proxy(proxy_uri.hostname,
proxy_uri.port, proxy_user, proxy_pass)
else
-
klass = Net::HTTP::Proxy(proxy_uri.host, proxy_uri.port)
-
klass = Net::HTTP::Proxy(proxy_uri.hostname, proxy_uri.port) end end
-
target_host = target.host
-
elsetarget_host = target.hostname target_port = target.port request_uri = target.request_uri
# FTP over HTTP proxy
-
target_host = proxy_uri.host
-
target_host = proxy_uri.hostname target_port = proxy_uri.port request_uri = target.to_s if proxy_user && proxy_pass
@@ -736,10 +736,10 @@ module URI
proxy_uri = ENV[name] || ENV[name.upcase]
end
-
if proxy_uri && self.host
-
if proxy_uri && self.hostname require 'socket' begin
-
addr = IPSocket.getaddress(self.host)
-
addr = IPSocket.getaddress(self.hostname) proxy_uri = nil if /\A127\.|\A::1\z/ =~ addr rescue SocketError end
@@ -804,7 +804,7 @@ module URI
# The access sequence is defined by RFC 1738
ftp = Net::FTP.new
-
ftp.connect(self.host, self.port)
-
ftp.connect(self.hostname, self.port) ftp.passive = true if !options[:ftp_active_mode] # todo: extract user/passwd from .netrc. user = 'anonymous'
Index: lib/net/http.rb¶
--- lib/net/http.rb (revision 29207)
+++ lib/net/http.rb (working copy)
@@ -64,7 +64,7 @@ module Net #:nodoc:
# require 'uri'
#
# url = URI.parse('http://www.example.com/index.html')
-
res = Net::HTTP.start(url.host, url.port) {|http|¶
@@ -75,7 +75,7 @@ module Net #:nodoc:
#
# url = URI.parse('http://www.example.com/index.html')
# req = Net::HTTP::Get.new(url.path)
-
res = Net::HTTP.start(url.host, url.port) {|http|¶
@@ -101,7 +101,7 @@ module Net #:nodoc:
# req = Net::HTTP::Post.new(url.path)
# req.basic_auth 'jack', 'pass'
# req.set_form_data({'from' => '2005-01-01', 'to' => '2005-03-31'}, ';')
-
res = Net::HTTP.new(url.host, url.port).start {|http|¶
http.request(req) }
-
res = Net::HTTP.new(url.hostname, url.port).start {|http|¶
http.request(req) }
# case res
# when Net::HTTPSuccess, Net::HTTPRedirection
# # OK
@@ -390,7 +390,7 @@ module Net #:nodoc:
}
else
uri = uri_or_host
-
new(uri.host, uri.port).start {|http|
-
new(uri.hostname, uri.port).start {|http| return http.request_get(uri.request_uri, &block) } end
@@ -415,7 +415,7 @@ module Net #:nodoc:
req = Post.new(url.path)
req.form_data = params
req.basic_auth url.user, url.password if url.user
-
new(url.host, url.port).start {|http|
-
endnew(url.hostname, url.port).start {|http| http.request(req) }
--
Tanaka Akira
=end
Updated by naruse (Yui NARUSE) over 14 years ago
=begin
(2010/09/10 6:44), Tanaka Akira wrote:
2010/9/8 Adam Majerredmine@ruby-lang.org:
Issue #3788 has been updated by Adam Majer.
In my opinion, adding dnssafehost to URI is bad design.
I think adding a method is a good design.
I agree with akr's patch; but URI#host should have a description that
it doesn't remove brackets.
--
NARUSE, Yui naruse@airemix.jp
=end
Updated by akr (Akira Tanaka) over 14 years ago
=begin
2010/9/12 NARUSE, Yui naruse@airemix.jp:
I agree with akr's patch; but URI#host should have a description that
it doesn't remove brackets.
I see.
Index: lib/uri/generic.rb¶
--- lib/uri/generic.rb (revision 29207)
+++ lib/uri/generic.rb (working copy)
@@ -205,8 +205,31 @@
self.set_path('') if @path && @opaque # (see RFC2396 Section 5.2)
self.set_port(self.default_port) if self.default_port && @port
end
+
attr_reader :scheme
+
-
returns the host component of the URI.¶
-
URI("http://foo/bar/baz").host #=> "foo"¶
-
It returns nil if no host component.¶
-
URI("mailto:foo@example.org").host #=> nil¶
-
The component doesn't contains the port number.¶
-
URI("http://foo:8080/bar/baz").host #=> "foo"¶
-
Since IPv6 addresses are wrapped by brackets in URIs,¶
-
this method returns IPv6 addresses wrapped by brackets.¶
-
This form is not appropriate to pass socket methods such as¶
TCPSocket.open.
-
If unwrapped host names are required, use "hostname" method.¶
-
URI("http://[::1]/bar/baz").host #=> "[::1]"¶
-
URI("http://[::1]/bar/baz").hostname #=> "::1"¶
-
attr_reader :host
-
attr_reader :port
attr_reader :registry
attr_reader :path
@@ -412,6 +435,38 @@
v
end -
extract the host part of the URI and unwrap brackets for IPv6 addresses.¶
-
This method is same as URI::Generic#host except¶
-
brackets for IPv6 (andn future IP) addresses are removed.¶
-
u = URI("http://[::1]/bar")¶
-
p u.hostname #=> "::1"¶
-
p u.host #=> "[::1]"¶
-
def hostname
-
v = self.host
-
/\A\[(.*)\]\z/ =~ v ? $1 : v
-
end
-
set the host part of the URI as the argument with brackets for¶
IPv6 addresses.
-
This method is same as URI::Generic#host= except¶
-
the argument can be bare IPv6 address.¶
-
u = URI("http://foo/bar")¶
-
p u.to_s #=> "http://foo/bar"¶
-
u.hostname = "::1"¶
-
p u.to_s #=> "http://[::1]/bar"¶
-
If the arugument seems IPv6 address,¶
-
it is wrapped by brackets.¶
- def hostname=(v)
-
v = "[#{v}]" if /\A\[.*\]\z/ !~ v && /:/ =~ v
-
self.host = v
- end
- def check_port(v)
return v unless v
--
Tanaka Akira
=end
Updated by shyouhei (Shyouhei Urabe) about 14 years ago
=begin
Hi, why is this issue still opened?
=end
Updated by naruse (Yui NARUSE) about 14 years ago
=begin
I fully agree with akr's last patch, please commit it.
=end
Updated by akr (Akira Tanaka) about 14 years ago
- Status changed from Open to Closed
- % Done changed from 0 to 100
=begin
This issue was solved with changeset r29416.
Adam, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
=end