Feature #18822
closed
Ruby lacks a proper method to percent-encode strings for URIs (RFC 3986)
Description
Context
There are two fairly similar encoding methods that are often confused: application/x-www-form-urlencoded, which is how form data is encoded, and "percent-encoding" as defined by RFC 3986.
AFAIK, the only way they differ is that form encoding escapes space characters as +, while RFC 3986 escapes them as %20. Most of the time it doesn't matter, but sometimes it does.
Ruby form and URL escape methods
- URI.escape(" ") # => "%20", but it was deprecated and removed (in 3.0?).
- ERB::Util.url_encode(" ") # => "%20", but it's implemented with a gsub and isn't very performant. It's also awkward to have to reach for ERB.
- CGI.escape(" ") # => "+"
- URI.encode_www_form_component(" ") # => "+"
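
To make the difference concrete, here is a small sketch comparing the three methods that still exist. The input string is just an illustration; the outputs in the comments are what I'd expect on a current Ruby.

```ruby
require "cgi"
require "erb"
require "uri"

# Illustrative input containing a space and a reserved character.
input = "a b&c"

# Form encoding (application/x-www-form-urlencoded): space becomes "+".
form_cgi = CGI.escape(input)                    # => "a+b%26c"
form_uri = URI.encode_www_form_component(input) # => "a+b%26c"

# Percent-encoding in the RFC 3986 style: space becomes "%20".
percent = ERB::Util.url_encode(input)           # => "a%20b%26c"

puts form_cgi, form_uri, percent
```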
Unescape methods
For unescaping, it's even more clear-cut since URI.unescape was removed. There is no available method left that treats an unescaped + as a literal +.
E.g. in JavaScript: decodeURIComponent("foo+bar") #=> "foo+bar".
If one were to use CGI.unescape, the string might be improperly decoded: CGI.unescape("foo+bar") #=> "foo bar".
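
As a sketch of the problem and of one possible workaround: the helper name percent_decode below is hypothetical (not an existing Ruby API). It protects literal + characters by rewriting them to %2B before handing the string to CGI.unescape.

```ruby
require "cgi"

# Hypothetical helper, not part of Ruby: decode percent-encoding while
# keeping any unescaped "+" as a literal plus sign.
def percent_decode(str)
  # CGI.unescape turns "+" into a space, so escape it first.
  CGI.unescape(str.gsub("+", "%2B"))
end

form_decoded    = CGI.unescape("foo+bar")     # => "foo bar" (form semantics)
percent_decoded = percent_decode("foo+bar")   # => "foo+bar"
space_decoded   = percent_decode("foo%20bar") # => "foo bar"

puts form_decoded, percent_decoded, space_decoded
```

The extra gsub pass is part of why a dedicated method would be nicer than this kind of workaround.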
Other languages
- JavaScript: encodeURI and encodeURIComponent use %20.
- PHP has urlencode using + and rawurlencode using %20.
- Python has urllib.parse.quote using %20 and urllib.parse.quote_plus using +.
Proposal
Since CGI already has a very performant encoder for application/x-www-form-urlencoded, I think it would make sense for it to provide another method for RFC 3986.
I propose:
- CGI.url_encode(" ") # => "%20"
  - Or CGI.encode_url.
- Alias CGI.escape as CGI.encode_www_form_component.
- Clarify the documentation of CGI.escape.
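
To illustrate the behavior I have in mind, here is a minimal pure-Ruby sketch. The module and method names (PercentEncoding.rfc3986_escape) are hypothetical and only for illustration; the actual method would presumably live in CGI and be implemented natively. It relies on the fact that CGI.escape already encodes a literal + as %2B, so any + left in its output can only stand for a space.

```ruby
require "cgi"

# Hypothetical module for illustration only; not an existing Ruby API.
module PercentEncoding
  # Percent-encode a string in the RFC 3986 style by reusing CGI.escape
  # and rewriting its "+" (which always means a space) to "%20".
  def self.rfc3986_escape(str)
    CGI.escape(str).gsub("+", "%20")
  end
end

space_only = PercentEncoding.rfc3986_escape(" ")     # => "%20"
mixed      = PercentEncoding.rfc3986_escape("a b+c") # => "a%20b%2Bc"

puts space_only, mixed
```

This still pays for a second pass over the string, which is what a built-in method would avoid.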