Bug #3217: Regexp fails to match string with '<' when encoding is UTF-8 - Ruby master - Ruby Issue Tracking System

Bug #3217

closed

Regexp fails to match string with '<' when encoding is UTF-8

Added by brixen (Brian Shirai) about 14 years ago. Updated about 13 years ago.

Status:

Rejected

Assignee:

naruse (Yui NARUSE)

Target version:

1.9.2

ruby -v:

ruby 1.9.2dev (2010-04-28 trunk 27536) [i386-darwin9.8.0]

Backport:

[ruby-core:29864]

Description

=begin
Hi,

There is an issue matching a string like "a b c d<" when the encoding of the file is set to UTF-8 and the regexp is attempting to match 'something'. Afaik, '<' is not special in the encoding.

This gist illustrates the issue:

http://gist.github.com/382510

Thanks,
Brian
=end

Related issues 1 (0 open — 1 closed)

Updated by naruse (Yui NARUSE) about 14 years ago

Status changed from Open to Rejected

=begin
'<' is not Punctuation in Unicode; it is Math_Symbol.
http://unicode.org/Public/UNIDATA/extracted/DerivedGeneralCategory.txt
=end
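
The category claim in the comment above can be checked directly with Unicode property escapes. This is a minimal sketch, not the code from the original gist: it assumes a Ruby recent enough to have `String#match?` (2.4 or later) and a UTF-8 source file, and the variable names are illustrative. It deliberately uses `\p{Sm}` and `\p{P}` instead of `[[:punct:]]`, because the POSIX bracket's definition depends on the Ruby version.

```ruby
# General_Category checks; these do not depend on how a given
# Ruby version defines the POSIX bracket [[:punct:]].
lt = "<"

is_math_symbol = lt.match?(/\p{Sm}/)        # Sm = Symbol, Math
is_punctuation = lt.match?(/\p{P}/)         # P  = Punctuation (any subcategory)
comma_is_punctuation = ",".match?(/\p{P}/)  # "," is Po (Punctuation, Other)

puts "'<' Math_Symbol? #{is_math_symbol}"
puts "'<' Punctuation? #{is_punctuation}"
puts "',' Punctuation? #{comma_is_punctuation}"
```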

Updated by naruse (Yui NARUSE) about 14 years ago

Status changed from Rejected to Assigned
Assignee set to naruse (Yui NARUSE)

=begin
Oops, I missed this. I'll fix.
=end

Updated by naruse (Yui NARUSE) about 14 years ago

Category set to M17N
Status changed from Assigned to Rejected

=begin
This is a feature change in Ruby 1.9.
http://www.unicode.org/reports/tr18/

And redcloth3's example is a bug; they should use their PUNCT constant.
=end
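
One way to avoid depending on the Unicode meaning of a punctuation class is to spell out the ASCII characters explicitly, which is the spirit of the suggestion above. The sketch below is an assumption for illustration only: `ASCII_PUNCT` is a hypothetical name and is not the PUNCT constant that redcloth3 defines.

```ruby
# Hypothetical explicit class covering the ASCII punctuation and symbol
# ranges: ! to /, : to @, [ to `, and { to ~.
ASCII_PUNCT = /[\x21-\x2F\x3A-\x40\x5B-\x60\x7B-\x7E]/

# The string from the report; =~ returns the index of the first match.
index = "a b c d<" =~ ASCII_PUNCT

puts "first ASCII punctuation at index #{index.inspect}"
```

Because the ranges are listed by code point, the result does not change with the file encoding or with how a Ruby version classifies '<'.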
