Feature #15182

Update extended grapheme cluster implementation for Unicode 11

Added by duerst (Martin Dürst) almost 3 years ago. Updated almost 3 years ago.

Status:
Closed
Priority:
Normal
Target version:
[ruby-core:89224]

Description

Reported by naruse (Yui NARUSE) at https://bugs.ruby-lang.org/issues/14802#change-74213:

The definition of extended grapheme cluster changed in Unicode 11 (Unicode® Standard Annex #29, Unicode Text Segmentation, revision 33: https://www.unicode.org/reports/tr29/tr29-33.html).
This affects Regexp /\X/, which is hardcoded in node_extended_grapheme_cluster() in regparse.c.

Previous definition:

( CRLF
| Prepend*
( RI-sequence | Hangul-Syllable | !Control )
( Grapheme_Extend | SpacingMark )*
| . )

Unicode 11 definition (UAX #29 revision 33):

crlf
| Control
| precore* core postcore*
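
To see what the change touches in practice, here is a minimal sketch (not part of the original report) that splits a few strings with /\X/ and compares the result with String#each_grapheme_cluster. The helper name `clusters` and the sample strings are illustrative assumptions. The cluster count for the ZWJ emoji sequence depends on which Unicode version the running Ruby implements, so no output is claimed for it.

```ruby
# Split a string into extended grapheme clusters using Regexp /\X/.
# `clusters` is a hypothetical helper name chosen for this sketch.
def clusters(str)
  str.scan(/\X/)
end

# Illustrative samples (assumptions, not taken from the issue).
samples = {
  "CRLF"           => "\r\n",                                    # the crlf alternative
  "combining mark" => "e\u0301",                                 # core followed by a postcore Extend
  "Hangul jamo"    => "\u1100\u1161",                            # Hangul syllable sequence
  "ZWJ emoji"      => "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}", # result depends on Unicode version
}

samples.each do |label, text|
  via_regexp = clusters(text)
  via_method = text.each_grapheme_cluster.to_a # available since Ruby 2.5
  puts "#{label}: #{via_regexp.size} cluster(s), methods agree: #{via_regexp == via_method}"
end
```

Running this on a Ruby built before and after the change is one way to check whether emoji sequences are segmented differently.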

Related issues

Blocks Ruby master - Feature #14802: Update Unicode data to Unicode Version 11.0.0 - Closed - duerst (Martin Dürst)
Blocked by Ruby master - Feature #15341: Provide emoji version as RbConfig::CONFIG['UNICODE_EMOJI_VERSION'] - Closed - matz (Yukihiro Matsumoto)
Blocked by Ruby master - Bug #15343: String#each_grapheme_cluster wrongly splits some emoji (genie, zombie, wrestling) - Closed - duerst (Martin Dürst)