Bug #7501
closed
\w in a regular expression doesn't match international characters
Status:
Rejected
Assignee:
-
Target version:
-
ruby -v:
ruby 1.9.3p0 (2011-10-30 revision 33570) [i686-linux]
Backport:
Description
When using regexp matching, \w doesn't match characters which are not in the English alphabet.
For example, the characters "žščřďťňaáéíóůúý" should all be matched by \w but aren't.
This program demonstrates the bug:
# encoding: utf-8
match = /\w+/.match( "abcdefghijklmnopqrstuvwxyz" )
puts match.to_s
match = /\w+/.match( "áéíóůúýžščřďťň" ) #some Czech characters
puts match.to_s
match = /\w+/.match( "üäö" ) #some German characters
puts match.to_s
Expected output:
abcdefghijklmnopqrstuvwxyz
áéíóůúýžščřďťň
üäö
Actual output:
abcdefghijklmnopqrstuvwxyz
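For comparison, here is a minimal sketch of Unicode-aware alternatives to \w. It assumes Ruby 1.9 or later, where the Regexp documentation describes \w as matching only [a-zA-Z0-9_], while \p{Word} and the POSIX bracket [[:alpha:]] also cover non-ASCII letters. The variable names are illustrative only and are not part of the original report.

```ruby
# encoding: utf-8

czech  = "áéíóůúýžščřďťň" # some Czech characters
german = "üäö"            # some German characters

# \w is limited to [a-zA-Z0-9_], so it finds nothing in the Czech string.
ascii_match = /\w+/.match(czech)

# \p{Word} (Unicode letters, marks, numbers, connector punctuation)
# and [[:alpha:]] (alphabetic characters) are Unicode-aware.
word_match  = /\p{Word}+/.match(czech)
alpha_match = /[[:alpha:]]+/.match(german)

puts ascii_match.inspect # prints nil
puts word_match.to_s
puts alpha_match.to_s
```

With these patterns the last two lines should print the Czech and German strings in full, which is the output the report expected from \w.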