Bug #5831: URI.extract not properly extracting URIs with trailing slash followed by single quote - Ruby - Ruby Issue Tracking System

Bug #5831

closed

URI.extract not properly extracting URIs with trailing slash followed by single quote

Added by bcardarella (Brian Cardarella) over 14 years ago. Updated over 14 years ago.

Status: Rejected
Assignee:
Target version: 1.9.2
ruby -v: 1.9.2-p290
Backport:

[ruby-core:<unknown>]

Description

I have example failing test cases here:

https://gist.github.com/1547904

Here is my use case: I want to extract URIs from emails. Nokogiri has been recommended, and that works fine when the email is HTML, but it doesn't work when the email is plain text. IMO this is a bug in URI.extract's regexp.

I have tested this against 1.8.7, 1.9.2, and 1.9.3, and the problem exists in all three.
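
For the plain-text email use case described above, one possible workaround is to post-process whatever URI.extract returns. The following is only a sketch under stated assumptions: the helper name `extract_uris_from_text`, the `TRAILING_PUNCTUATION` pattern, and the sample email body are all hypothetical and do not come from the linked gist. It also assumes that a quote or sentence punctuation at the very end of a match belongs to the surrounding prose rather than to the URI, which will not hold for every URL.

```ruby
require 'uri'

# Hypothetical pattern (an assumption, not part of the standard library):
# characters that are legal inside a URI but, at the very end of a match,
# usually belong to the surrounding text.
TRAILING_PUNCTUATION = /['").,;:!?]+\z/

# Hypothetical helper: extract http(s) URIs from plain text and strip
# trailing quote/punctuation characters from each match.
# Note: on newer Rubies URI.extract emits an obsolescence warning under -w.
def extract_uris_from_text(text)
  URI.extract(text, %w[http https]).map do |uri|
    uri.sub(TRAILING_PUNCTUATION, '')
  end
end

body = "Reset your password at 'http://example.com/reset/' today."
puts extract_uris_from_text(body)
```

The trade-off is that a URI which legitimately ends in one of these characters (for example a closing parenthesis) would be truncated, so the pattern should be adjusted to the kind of mail being processed.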

Updated by xds2000 (deshi xiao) over 14 years ago
#1 [ruby-core:42245]

I have been reading lib/uri/common.rb and found that URI.extract's behavior is to split URLs on whitespace, so I think your report is not a bug. Here is a clue; please have a look.

# Constructs the default Hash of Regexp's
def initialize_regexp(pattern)
  ret = {}

  # for URI::split
  ret[:ABS_URI] = Regexp.new('\A\s*' + pattern[:X_ABS_URI] + '\s*\z', Regexp::EXTENDED)
  ret[:REL_URI] = Regexp.new('\A\s*' + pattern[:X_REL_URI] + '\s*\z', Regexp::EXTENDED)
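
As a small illustration of the whitespace behavior this comment describes, the sketch below runs URI.extract over a sentence containing two URIs. The sample text and host names are made up for the example; it only shows that each match ends at whitespace and does not say anything about how quote characters are handled.

```ruby
require 'uri'

# Made-up sample text (an assumption for illustration only).
text = "see http://a.example/x and https://b.example/y for details"

# Restrict matching to http and https so ordinary words are never
# considered as URI schemes.
uris = URI.extract(text, %w[http https])

puts uris.inspect
# prints ["http://a.example/x", "https://b.example/y"]
```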

Updated by naruse (Yui NARUSE) over 14 years ago
#2 [ruby-core:42255]

Status changed from Open to Rejected

Sorry for the late reply.

As deshi says, that's not a bug, it's a feature.
