Bug #3217
Regexp fails to match string with '<' when encoding is UTF-8
| Status: | Rejected | Start date: | 04/29/2010 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | | % Done: | 0% |
| Category: | M17N | | |
| Target version: | 1.9.2 | | |
| ruby -v: | ruby 1.9.2dev (2010-04-28 trunk 27536) [i386-darwin9.8.0] | | |
Description
Hi,

There is an issue matching a string like "a *b* c *d*<" when the encoding of the file is set to UTF-8 and the regexp is attempting to match '*something*'. Afaik, *< is not special in the encoding.

This gist illustrates the issue: http://gist.github.com/382510

Thanks,
Brian
History
Updated by naruse (Yui NARUSE) about 2 years ago
- Status changed from Open to Rejected
'<' is not Punctuation in Unicode; its general category is Math_Symbol (Sm). See http://unicode.org/Public/UNIDATA/extracted/DerivedGeneralCategory.txt
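The category claim above can be checked directly with Unicode property classes. This is a minimal sketch, not taken from the original gist: the method name `unicode_category_flags` is made up for illustration, and it assumes a Ruby whose regexp engine supports `\p{P}` and `\p{Sm}` on UTF-8 strings (1.9 and later).

```ruby
# Report whether a single character falls in the Unicode Punctuation (P)
# or Math_Symbol (Sm) general category.
# The method name is illustrative, not part of Ruby or RedCloth.
def unicode_category_flags(char)
  utf8 = char.encode("UTF-8")
  {
    punctuation: !(utf8 =~ /\p{P}/).nil?,
    math_symbol: !(utf8 =~ /\p{Sm}/).nil?
  }
end

lt   = unicode_category_flags("<")  # U+003C LESS-THAN SIGN
star = unicode_category_flags("*")  # U+002A ASTERISK

puts "'<' punctuation? #{lt[:punctuation]}, math symbol? #{lt[:math_symbol]}"
puts "'*' punctuation? #{star[:punctuation]}, math symbol? #{star[:math_symbol]}"
```

Under these assumptions '<' should report as a math symbol and not as punctuation, while '*' reports as punctuation, which is why a pattern that relies on a Unicode-aware punctuation class can stop matching once '<' follows the closing '*'.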
Updated by naruse (Yui NARUSE) about 2 years ago
- Status changed from Rejected to Assigned
- Assignee set to naruse (Yui NARUSE)
Oops, I missed this. I'll fix.
Updated by naruse (Yui NARUSE) about 2 years ago
- Category set to M17N
- Status changed from Assigned to Rejected
This is a feature change in Ruby 1.9; see http://www.unicode.org/reports/tr18/. redcloth3's example is a bug: they should use their PUNCT constant.
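One way to read the suggestion above is to spell out the intended punctuation set explicitly instead of relying on an encoding-dependent class. The sketch below is only an illustration of that idea: `ASCII_PUNCT` and `STARRED` are hypothetical names, the character set is the 32 ASCII punctuation characters, and none of it is RedCloth's actual PUNCT definition or the pattern from the gist.

```ruby
# Hypothetical explicit set: the 32 ASCII punctuation characters
# (! to /, : to @, [ to `, { to ~). Not RedCloth's real PUNCT constant.
ASCII_PUNCT = /[!-\/:-@\[-`{-~]/

# Hypothetical pattern: text wrapped in '*' that is followed by
# explicit punctuation, whitespace, or the end of the string.
STARRED = /\*([^*]+)\*(?=#{ASCII_PUNCT}|\s|\z)/

sample = "a *b* c *d*<"
matches = sample.scan(STARRED).flatten

puts matches.inspect  # prints ["b", "d"]
```

Because the set is listed by code point, '<' is included regardless of how Unicode categorizes it, so the match no longer depends on the file encoding.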