Feature #12886

URI#merge doesn't handle paths correctly

Added by ioquatix (Samuel Williams) 5 months ago. Updated 5 months ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:77819]

Description

I feel like this should work.

> URI.parse("/base/uri") + URI.parse("relative")
URI::BadURIError: both URI are relative

The result should be a URI with path = "/base/relative".

But it doesn't. It fails with an exception.

There are two ways to fix this. The first is to change the meaning of URI#absolute? to relate to the absoluteness of the path, not whether or not there is a scheme.

The second way to fix this is to directly work around the issue in merge.

In my opinion

> URI.parse("a/b") + URI.parse("c")
URI::BadURIError: both URI are relative

should also work, with a result of "a/c".
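A minimal sketch of the requested behavior, written outside of URI#merge. It assumes a hypothetical helper, `merge_relative`, and a placeholder base (`http://placeholder.invalid/`), neither of which is part of the uri library. It only returns the merged path and ignores query and fragment:

```ruby
require "uri"

# Placeholder base used only so that URI#merge accepts the operands.
# This constant and the helper below are hypothetical, not part of uri.
PLACEHOLDER_BASE = URI.parse("http://placeholder.invalid/")

# Merge two relative references and return the resulting path as a String.
def merge_relative(base_ref, ref)
  base = URI.parse(base_ref)
  resolved = PLACEHOLDER_BASE + base + URI.parse(ref)
  path = resolved.path
  # If the base had no leading slash, drop the one contributed by the placeholder.
  path = path.sub(%r{\A/}, "") unless base.path.start_with?("/")
  path
end

rooted   = merge_relative("/base/uri", "relative")  # "/base/relative"
unrooted = merge_relative("a/b", "c")               # "a/c"
```

The placeholder supplies the scheme that URI#merge currently insists on. Whether that is an acceptable assumption is exactly what this ticket is about.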

The need for the LHS of the operation to contain a scheme is not a useful requirement in practice, and in addition, I'd like to state that URI("a/c") is actually a valid URI. So, it's purely the merge function being too limited in what it will handle for no obvious reason.

Situations where this comes up: parsing a website which contains relative URLs, and you want to construct absolute URLs.

History

#1 [ruby-core:77821] Updated by phluid61 (Matthew Kerwin) 5 months ago

This ticket should be re-cast as a feature request, to allow merging of two relative references.

Incidentally:

Samuel Williams wrote:

[...] I'd like to state that URI("a/c") is actually a valid URI.

As discussed elsewhere, it's a "relative reference", not a "URI." It parses to a URI::Generic object for pragmatic reasons.
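For illustration, this is roughly how such a relative reference looks once parsed. The variable names are only for this example:

```ruby
require "uri"

ref = URI.parse("a/c")

ref_class    = ref.class      # URI::Generic
ref_scheme   = ref.scheme     # nil, there is no scheme component
ref_relative = ref.relative?  # true

# A leading slash on the path does not make the reference absolute,
# because absolute? only checks whether a scheme is present.
rooted_absolute = URI.parse("/base/uri").absolute?  # false
```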

So, it's purely the merge function being too limited in what it will handle for no obvious reason.

Except for Internet Standard 66 (RFC 3986), section 5.1 of which says "The term "relative" implies that a "base URI" exists against which the relative reference is applied. Aside from fragment-only references, relative references are only usable when a base URI is known."

You have two relative references, not a (base) URI and a relative reference.

Situations where this comes up: parsing a website which contains relative URLs, and you want to construct absolute URLs.

Except that this doesn't construct absolute URLs, it constructs different relative references.
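A short sketch of the difference, assuming a hypothetical page URL and hrefs chosen only for illustration. When the base has a scheme, each relative reference resolves to an absolute URI. Without one, URI#merge raises:

```ruby
require "uri"

# Hypothetical page URL and hrefs, used only for this example.
page_url = URI.parse("http://example.com/docs/index.html")
hrefs = ["relative", "/top", "../up"]

# Resolving against a base URI that has a scheme yields absolute URIs.
absolute = hrefs.map { |href| (page_url + href).to_s }
# ["http://example.com/docs/relative", "http://example.com/top", "http://example.com/up"]

# With two relative references there is no base URI, so merge refuses.
message = nil
begin
  URI.parse("/base/uri") + URI.parse("relative")
rescue URI::BadURIError => e
  message = e.message  # "both URI are relative", as in the report above
end
```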

#2 [ruby-core:77831] Updated by nobu (Nobuyoshi Nakada) 5 months ago

  • Tracker changed from Bug to Feature

#3 [ruby-core:77892] Updated by duerst (Martin Dürst) 5 months ago

Samuel Williams wrote:

I feel like this should work.

Feelings are not enough. As Matthew already said, https://tools.ietf.org/html/rfc3986.html#section-5.2 doesn't define this. I can see at least the following problems:

1) Based on experience with RFC 3986 and its predecessors, there would be a lot of details to get right, where often the definition of 'right' is quite unclear.

2) Not all URI schemes that may contain '/' allow or define relative processing. As an example, mailto: allows '/' in some places, but doesn't do relative processing (except that you can combine something without the scheme name with a base that has the scheme). Because when you have "/base/uri" and "relative", you don't know what the scheme is/will be, combining them would have to assume too much.

3) The fact that it's not defined in RFC 3986 may be a strong indication that it's not an operation that can be assumed to work.

4) I wonder whether there are any other languages/libraries that implement anything like the operation you propose. Of course, there are libraries working on file system paths that will do this kind of operation, but these don't exactly count.
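As an illustration of point 4, Ruby's own file system path tools already perform this kind of operation. This is only a sketch of path handling: it has no notion of scheme, authority, query, or fragment, so it is not equivalent to URI reference resolution:

```ruby
require "pathname"

# Replace the last segment of a path with another segment.
rooted   = (Pathname.new("/base/uri").dirname + "relative").to_s  # "/base/relative"
unrooted = (Pathname.new("a/b").dirname + "c").to_s               # "a/c"

# The same idea with the File class methods.
joined = File.join(File.dirname("/base/uri"), "relative")         # "/base/relative"
```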
