Bug #2822

Russian characters are missing from word characters types in Regexp

Added by stas (Stas Senotrusov) almost 10 years ago. Updated over 8 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version:
ruby -v: ruby 1.9.2dev (2010-02-27 trunk 26772) [i686-linux]
Backport:
[ruby-core:28354]

Description

=begin
"Hello".match(/[\w]*/)
=> #

"Привет".match(/[\w]*/)
=> #

"Привет".match(/[А-Яа-яЁё\w]*/)
=> #

The non-word character type \W behaves similarly.
=end
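A minimal sketch of the behavior reported above, assuming a UTF-8 source file. The variable names are illustrative and not from the original report; the point is only that \w and \W treat Cyrillic letters as non-word characters, while an explicit range does match them.

```ruby
# encoding: utf-8

word = "Привет"

# \w covers only [a-zA-Z0-9_], so the run of word characters at the
# start of a Cyrillic string is empty.
ascii_word_run = word[/\w*/]

# \W is the complement of \w, so every Cyrillic letter counts as a
# non-word character.
non_word_chars = word.scan(/\W/)

# Listing the Cyrillic ranges explicitly (as in the report) matches the
# whole string.
explicit_range = word[/[А-Яа-яЁё\w]*/]

puts ascii_word_run.inspect   # ""
puts non_word_chars.size      # 6
puts explicit_range           # Привет
```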

History

#1

Updated by Eregon (Benoit Daloze) almost 10 years ago

=begin
$ ri Regexp
/\w/ - A word character ([a-zA-Z0-9_])

/[[:word:]]/ - A character in one of the following Unicode
general categories: Letter, Mark, Number,
Connector_Punctuation

/\p{Word}/ - A member of one of the following Unicode general
categories: Letter, Mark, Number, Connector_Punctuation

"aér".match /\w+/
=> #
"aér".match /[[:word:]]+/
=> #
"aér".match /\p{Word}+/
=> #

The Regexp documentation in Ruby 1.9 is awesome; have a look ;)
=end
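The three word classes quoted from the documentation can be compared side by side. This is a rough sketch; `word_runs` is a hypothetical helper written for this illustration, not part of Ruby, and the "é" is written as the precomposed code point U+00E9 so the result does not depend on how the source text was normalized.

```ruby
# encoding: utf-8

# Hypothetical helper: returns the first run matched by each word class.
def word_runs(text)
  {
    ascii:   text[/\w+/],          # \w: ASCII only, [a-zA-Z0-9_]
    posix:   text[/[[:word:]]+/],  # POSIX bracket expression, Unicode-aware
    unicode: text[/\p{Word}+/]     # Unicode property, same categories
  }
end

["Hello", "Привет", "a\u00E9r"].each do |text|
  puts "#{text}: #{word_runs(text).inspect}"
end
```

For text that may contain non-ASCII letters, `[[:word:]]` or `\p{Word}` is the class to reach for; `\w` stops at the first non-ASCII character and matches nothing at all in a purely Cyrillic string.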

#2

Updated by naruse (Yui NARUSE) almost 10 years ago

  • Status changed from Open to Closed

