Bug #11859: Regexp matching with \p{Upper} and \p{Lower} for EUC-JP doesn’t work. - Ruby - Ruby Issue Tracking System

closed

Added by matsui (Kimihito Matsui) about 10 years ago. Updated over 9 years ago.

Status: Rejected
Assignee:
Target version:
ruby -v: ruby 2.2.2p95 (2015-04-13 revision 50295) [x86_64-darwin14]
Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN

[ruby-dev:49454]

Description

U+FF21 (Ａ, FULLWIDTH LATIN CAPITAL LETTER A) and U+00C0 (À, LATIN CAPITAL LETTER A WITH GRAVE) are both Uppercase_Letter, so each of the following should match at index 0. Instead, both return 1.

ruby -e 'puts "\uFF21A".encode("EUC-JP") =~ Regexp.compile("\\\p{Upper}".encode("EUC-JP"))' # => 1
ruby -e 'puts "\u00C0A".encode("EUC-JP") =~ Regexp.compile("\\\p{Upper}".encode("EUC-JP"))' # => 1

The same happens with lowercase matching.

ruby -e 'puts "\uFF41a".encode("EUC-JP") =~ Regexp.compile("\\\p{Lower}".encode("EUC-JP"))' # => 1

With a Unicode encoding (UTF-8), the match works as expected:

ruby -e 'puts "\uFF21A" =~ Regexp.compile("\\\p{Upper}")'  # => 0

It looks like \p{Upper} and \p{Lower} in EUC-JP regexps match only ASCII characters.
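Since the UTF-8 case above matches as expected, one possible workaround is to transcode the EUC-JP string to UTF-8 before matching. The following is only a minimal sketch of that idea: `unicode_match_index` is a hypothetical helper name, not part of Ruby, and it assumes every character in the input can be converted to UTF-8.

```ruby
# Hypothetical helper (not a Ruby built-in): transcode to UTF-8 first so
# that \p{Upper} / \p{Lower} are evaluated with Unicode properties.
# Assumes the input string is valid and convertible to UTF-8.
def unicode_match_index(str, pattern)
  str.encode("UTF-8") =~ pattern
end

eucjp_upper = "\uFF21A".encode("EUC-JP")
eucjp_lower = "\uFF41a".encode("EUC-JP")

p unicode_match_index(eucjp_upper, /\p{Upper}/) # => 0
p unicode_match_index(eucjp_lower, /\p{Lower}/) # => 0
```

The returned index refers to the UTF-8 copy. Because `=~` reports character offsets, it should line up with the character position in the original EUC-JP string.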

Related issues 1 (0 open — 1 closed)

#1 [ruby-dev:49455] Updated by matsui (Kimihito Matsui) about 10 years ago

#2 [ruby-dev:49456] Updated by matsui (Kimihito Matsui) about 10 years ago

#3 [ruby-dev:49663] Updated by naruse (Yui NARUSE) over 9 years ago

#4 [ruby-dev:49664] Updated by duerst (Martin Dürst) over 9 years ago

#5 Updated by duerst (Martin Dürst) over 8 years ago