Bug #21503

\p{Word} does not match on \p{Join_Control} while docs say it does

Added by procmarco (Marco Concetto Rudilosso) about 19 hours ago. Updated about 11 hours ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:122665]

Description

The docs state that \p{Word} matches the equivalent of [\p{M}\p{Nd}\p{Pc}\p{Alpha}\p{Join_Control}], which is also how the Unicode spec defines it.

However, that does not seem to be the case:

irb(main):018> REGEX = /\p{Word}/u
=> /\p{Word}/
irb(main):019> "\u200D".gsub(REGEX, "-")
=> "‍"
irb(main):020> REGEX2 = /\p{Join_Control}/u
=> /\p{Join_Control}/
irb(main):021> "\u200D".gsub(REGEX2, "-")
=> "-"

There are two possible solutions here: change either the docs or the code.
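
To check the discrepancy beyond a single character, the comparison can be scripted. The following is a minimal sketch, not part of the original report: the constant and method names (`DOCUMENTED_WORD`, `WORD`, `JOIN_CONTROLS`, `word_mismatches`) are hypothetical, and the union regexp is simply copied from the docs quoted above.

```ruby
# Union that the docs give as the equivalent of \p{Word}.
DOCUMENTED_WORD = /[\p{M}\p{Nd}\p{Pc}\p{Alpha}\p{Join_Control}]/
WORD = /\p{Word}/

# The two Join_Control code points: ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER.
JOIN_CONTROLS = [0x200C, 0x200D]

# Hypothetical helper: returns one row per code point for which \p{Word}
# and the documented union disagree.
def word_mismatches(codepoints)
  codepoints.filter_map do |cp|
    ch = [cp].pack("U")                  # code point -> UTF-8 string
    in_word  = ch.match?(WORD)
    in_union = ch.match?(DOCUMENTED_WORD)
    [format("U+%04X", cp), in_word, in_union] if in_word != in_union
  end
end

# Each row is [code point, matches \p{Word}?, matches documented union?].
word_mismatches(JOIN_CONTROLS).each { |row| p row }
```

On a Ruby build that behaves as described in this report, U+200D should be listed as a mismatch (and presumably U+200C as well); an empty result would suggest that the code and the docs agree.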


Related issues 1 (0 open, 1 closed)

Related to Ruby - Bug #19417: Regexp \p{Word} and [[:word:]] do not match Unicode Other_Number character (Closed)

Updated by procmarco (Marco Concetto Rudilosso) about 19 hours ago

What I mean is that the current implementation of \p{Word} does not seem to match \p{Join_Control} characters, even though it should and the docs say it does.

Actions #2

Updated by mame (Yusuke Endoh) about 11 hours ago

  • Related to Bug #19417: Regexp \p{Word} and [[:word:]] do not match Unicode Other_Number character added