Bug #17029

URI.parse considers https://example.com/### invalid when browsers consider it valid

Added by nileshtr (Nilesh Trivedi) 2 months ago. Updated 2 months ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
ruby -v: ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-darwin19]
[ruby-core:99153]

Description

I have a form with <input type="url" required>, and in the backend I try to extract the domain with URI.parse(url).host.

A user was able to submit a value like https://example.com/###, which passed the browser's validation check but was rejected by URI.parse with this error:

        3: from /Users/helix/.rbenv/versions/2.7.1/lib/ruby/2.7.0/uri/common.rb:234:in `parse'
        2: from /Users/helix/.rbenv/versions/2.7.1/lib/ruby/2.7.0/uri/rfc3986_parser.rb:73:in `parse'
        1: from /Users/helix/.rbenv/versions/2.7.1/lib/ruby/2.7.0/uri/rfc3986_parser.rb:67:in `split'
URI::InvalidURIError (bad URI(is not URI?): "https://example.com/###")
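
A minimal sketch of how the backend could guard against this, assuming it is acceptable to treat a URL that Ruby's parser rejects as having no host. The helper name `extract_host` is hypothetical and not part of the original application; only `URI.parse` and `URI::InvalidURIError` come from the report above.

```ruby
require "uri"

# Hypothetical helper: returns the host of +url+, or nil when
# URI.parse rejects the string with URI::InvalidURIError.
def extract_host(url)
  URI.parse(url).host
rescue URI::InvalidURIError
  nil
end

# A URL with a single fragment delimiter parses normally.
puts extract_host("https://example.com/path#frag").inspect

# The value from this report; on Ruby 2.7.1 URI.parse raises for it,
# so the helper falls back to nil instead of propagating the error.
puts extract_host("https://example.com/###").inspect
```

This only avoids the unhandled exception. It does not make Ruby accept the same set of URLs as the browser's validation.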

You can try the browser's behavior at MDN's demo: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/url

This is what the MDN page says about validation:

The syntax of a URL is fairly intricate. It's defined by WHATWG's URL Living Standard ( https://url.spec.whatwg.org/ ) and is described for newcomers in our article What is a URL? ( https://developer.mozilla.org/en-US/docs/Learn/Common_questions/What_is_a_URL )
