Bug #13171

URI::FTP path has a trailing slash when just hostname and scheme provided

Added by miloprice (Milo Price) about 2 years ago. Updated about 2 years ago.

Status: Open
Priority: Normal
Assignee: -
Target version: -
[ruby-core:79315]

Description

As with HTTP URIs, the trailing slash on an FTP URI should be optional, per RFC 1738 (ftp://ftp.funet.fi/pub/doc/rfc/rfc1738.txt). However, under the current implementation, URI::FTP#to_s always emits a trailing slash when only a hostname is provided (i.e., no path):

URI.parse("http://example.com").to_s
=> "http://example.com"
URI.parse("ftp://example.com").to_s
=> "ftp://example.com/"

History

Updated by phluid61 (Matthew Kerwin) about 2 years ago

Comments about RFC 1738 being "obsolete" notwithstanding, the slash between host/port and url-path is optional when the url-path is empty. Specifically, there is no prohibition on including it.

Section 3.1 ("Common Internet Scheme Syntax"):

Some or all of the parts "<user>:<password>@", ":<password>",
":<port>", and "/<url-path>" may be excluded.

and:

url-path

The rest of the locator consists of data specific to the
scheme, and is known as the "url-path". It supplies the
details of how the specified resource can be accessed. Note
that the "/" between the host (or port) and the url-path is
NOT part of the url-path.

Section 3.2.2. ("FTP url-path"):

The url-path of a FTP URL has the following syntax:

<cwd1>/<cwd2>/.../<cwdN>/<name>;type=<typecode>

So, according to the text, the slash is purely presentational and has no bearing on the path of the URL. This is reflected in the documentation for URI::FTP#path.

Also please note the specification of how to interpret the path elements after that initial slash: https://tools.ietf.org/html/rfc1738#section-3.2.2
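To make the url-path syntax quoted above concrete, here is a minimal Ruby sketch that splits a url-path into its <cwd> segments, its <name>, and an optional typecode. The helper split_ftp_url_path and the FtpUrlPath struct are hypothetical names, not part of Ruby's URI library, and the sketch deliberately leaves percent-encoded segments such as %2F undecoded.

```ruby
# Hypothetical container for the parts named in RFC 1738 section 3.2.2.
FtpUrlPath = Struct.new(:cwds, :name, :typecode)

# Hypothetical helper (not part of Ruby's URI library).
# Expects the url-path only, i.e. everything after the "/" that
# separates host (or port) from the url-path.
def split_ftp_url_path(url_path)
  # Separate the optional ";type=<typecode>" suffix from the path proper.
  path, separator, type = url_path.partition(";type=")

  # Every segment except the last is a directory to change into;
  # the last segment is the name. Empty segments are preserved.
  segments = path.split("/", -1)
  name = segments.pop

  FtpUrlPath.new(segments, name, separator.empty? ? nil : type)
end

parts = split_ftp_url_path("pub/doc/rfc1738.txt;type=i")
puts parts.cwds.inspect
puts parts.name
puts parts.typecode
```

Note that the leading slash never reaches this helper, which matches the RFC's statement that the "/" is not part of the url-path.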

Cheers

Updated by shevegen (Robert A. Heiler) about 2 years ago

Guess the behaviour is then correct because it is specified.

But I still consider it unintuitive, particularly for Ruby.

Should .to_s ever change the representation or given input to tokens that were not part of the original input?

Because the '/' was not part of it.

http://ruby-doc.org/stdlib-2.4.0/libdoc/uri/rdoc/URI/FTP.html#method-i-to_s states:

"Returns a String representation of the URI::FTP"

Anyway, I guess it is indeed the correct behaviour even if it seems weird.

Updated by phluid61 (Matthew Kerwin) about 2 years ago

If you take the URI object as a data structure with components then any stringification that round-trips through parsing is fine. This is true of any normalisation or canonicalisation.

It's only an issue if you think of the URI as a string.
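The round-trip property can be checked directly. The following is a small sketch, with round_trips? being a hypothetical helper name; it relies on URI's own == to compare the parsed objects rather than comparing strings.

```ruby
require "uri"

# Hypothetical helper: true when parsing the stringified URI
# yields an object equal to the one we started with.
def round_trips?(string)
  original = URI.parse(string)
  URI.parse(original.to_s) == original
end

puts round_trips?("http://example.com")
puts round_trips?("ftp://example.com")
```

Comparing objects instead of strings is the point here: the strings may differ by a slash while the parsed components stay equal.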

Updated by ebarendt (Eric Barendt) about 2 years ago

Robert A. Heiler wrote:

Should .to_s ever change the representation or given input to tokens that were not part of the original input?
Because the '/' was not part of it.
Anyway, I guess it is indeed the correct behaviour even if it seems weird.

I disagree that it's correct. But it's also inconsistent with HTTP. Further, where does the "/" come from anyway?

>> URI.parse("http://example.com").path
=> ""
>> URI.parse("ftp://example.com").path
=> ""

Updated by phluid61 (Matthew Kerwin) about 2 years ago

Eric Barendt wrote:

Further, where does the "/" come from anyway?

It's in #to_s

URI.parse("ftp://example.com").to_s
#=> "ftp://example.com/"

i.e. it's presentational, and doesn't affect the semantic meaning.

Updated by phluid61 (Matthew Kerwin) about 2 years ago

Eric Barendt wrote:

I disagree that it's correct.

How so? If we can identify the precise issue, it could be useful as a seed for updating the code and/or the specs.

But it's also inconsistent with HTTP.

That's to be expected; the HTTP URI scheme is defined in up-to-date specifications (work back through the references starting with RFC 7230).

The FTP scheme lives in an ancient, officially "obsolete" specification (RFC 1738), which pre-dates even the generic syntax of RFC 3986. So inconsistencies are to be expected, even if unwanted.

I've just spent four years or so trying to update the FILE scheme from RFC 1738 to a more modern context; perhaps someone could do the same for ftp. Or you could forgo the IETF and instead work against the WHATWG's URL spec, which mentions ftp:// URLs even if it doesn't describe how to actually use them. (You still need RFC 1738 for that, alas.)

Updated by naruse (Yui NARUSE) about 2 years ago

Eric Barendt wrote:

I disagree that it's correct. But it's also inconsistent with HTTP. Further, where does the "/" come from anyway?

>> URI.parse("http://example.com").path
=> ""
>> URI.parse("ftp://example.com").path
=> ""

About instance variables,

irb(main):005:0> URI.parse("ftp://example.com").instance_variable_get(:@path)
=> "/"
irb(main):006:0> URI.parse("http://example.com").instance_variable_get(:@path)
=> ""

Maybe they should be

  • @path should be "" or nil
  • uri.path should be '/'
  • uri.to_s should be without trailing '/'

That said, such an inconsistency may be acceptable, because the semantics of the path differ considerably between HTTP and FTP.
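For anyone who needs the slash-free form today, a workaround can live outside the library. This is only a sketch, assuming that stripping a bare trailing slash is acceptable for your use case; ftp_string_without_bare_slash is a hypothetical helper and does not change URI::FTP itself.

```ruby
require "uri"

# Hypothetical helper (not part of Ruby's URI library).
# Drops the trailing "/" only when the URI has no path at all;
# URIs with a real path are returned unchanged.
def ftp_string_without_bare_slash(uri)
  string = uri.to_s
  uri.path.empty? ? string.chomp("/") : string
end

puts ftp_string_without_bare_slash(URI.parse("ftp://example.com"))
puts ftp_string_without_bare_slash(URI.parse("ftp://example.com/pub/file.txt"))
```

Because the check uses uri.path rather than the string, a URI with an empty path but a typecode suffix is left as-is.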

Updated by naruse (Yui NARUSE) about 2 years ago

Matthew Kerwin wrote:

Eric Barendt wrote:

I disagree that it's correct.

How so? If we can identify the precise issue, it could be useful as a seed for updating the code and/or the specs.

But it's also inconsistent with HTTP.

That's to be expected; the HTTP URI scheme is defined in up-to-date specifications (work back through the references starting with RFC 7230).

The FTP scheme lives in an ancient, officially "obsolete" specification (RFC 1738), which pre-dates even the generic syntax of RFC 3986. So inconsistencies are to be expected, even if unwanted.

I've just spent four years or so trying to update the FILE scheme from RFC 1738 to a more modern context; perhaps someone could do the same for ftp. Or you could forgo the IETF and instead work against the WHATWG's URL spec, which mentions ftp:// URLs even if it doesn't describe how to actually use them. (You still need RFC 1738 for that, alas.)

https://tools.ietf.org/html/draft-yevstifeyev-ftp-uri-scheme-08 is a draft.
https://bugs.ruby-lang.org/issues/7310#note-6 though some edge cases are still different...
