Feature #14802

Update Unicode data to Unicode Version 11.0.0

Added by duerst (Martin Dürst) 12 months ago. Updated 5 months ago.

Status:
Closed
Priority:
Normal
Target version:
[ruby-core:87335]

Description

Unicode Version 11.0.0 will be published sometime later this year, probably in late June. This is an issue to manage updating Ruby to Unicode 11.0.0. Details to follow.


Related issues

Related to Ruby trunk - Feature #13685: Update Unicode data to Unicode Version 10.0.0 (Closed)
Blocked by Ruby trunk - Feature #14839: How to deal with capitalizing Georgian in Unicode 11.0.0 (Closed)
Blocked by Ruby trunk - Feature #15182: Update extended grapheme cluster implementation for Unicode 11 (Closed)
Blocked by Ruby trunk - Feature #15317: How to deal with obsolete property values in Unicode 11.0.0 (Closed)
Blocks Ruby trunk - Feature #15321: Update Unicode data to Unicode Version 12.0.0 (Closed)
Blocked by Ruby trunk - Bug #15337: String#each_grapheme_cluster wrongly splits "\r\n" (Closed)

Associated revisions

Revision c2d8078e
Added by duerst (Martin Dürst) 5 months ago

delete Unicode 10.0.0 related files, no longer needed [#14802]
This line, and those below, will be ignored--

D enc/unicode/10.0.0

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66295 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 66295
Added by duerst (Martin Dürst) 5 months ago

delete Unicode 10.0.0 related files, no longer needed [#14802]
This line, and those below, will be ignored--

D enc/unicode/10.0.0

History

#1

Updated by duerst (Martin Dürst) 12 months ago

  • Related to Feature #13685: Update Unicode data to Unicode Version 10.0.0 added

Updated by shevegen (Robert A. Heiler) 12 months ago

All power to the emoji. \o/

Updated by duerst (Martin Dürst) 12 months ago

Unicode Version 11.0.0 has been published; the official announcement can be found at http://blog.unicode.org/2018/06/announcing-unicode-standard-version-110.html.

#4

Updated by duerst (Martin Dürst) 12 months ago

  • Blocked by Feature #14839: How to deal with capitalizing Georgian in Unicode 11.0.0 added

Updated by naruse (Yui NARUSE) 8 months ago

Just a note: the definition of extended grapheme cluster changed in Unicode 11 (Unicode® Standard Annex #29, Unicode Text Segmentation, revision 33: https://www.unicode.org/reports/tr29/tr29-33.html).
This affects Regexp /\X/, which is hardcoded in node_extended_grapheme_cluster() in regparse.c.

Before:

( CRLF
| Prepend*
( RI-sequence | Hangul-Syllable | !Control )
( Grapheme_Extend | SpacingMark )*
| . )

Unicode 11:

crlf
| Control
| precore* core postcore*
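
A minimal Ruby sketch of what these rules mean at the String level, assuming a Ruby whose /\X/ and String#grapheme_clusters already follow the Unicode 11 definition. The sample strings, the SAMPLES constant, and the two helper methods are illustrative assumptions, not taken from regparse.c or its tests.

```ruby
# Illustrative samples (assumed for this sketch, not from the Ruby test suite).
SAMPLES = {
  "CRLF"           => "\r\n",          # covered by the crlf alternative
  "combining mark" => "e\u0301",       # base letter plus combining acute accent
  "Hangul jamo"    => "\u1100\u1161",  # leading consonant plus vowel
  "two letters"    => "ab",            # two independent clusters
}.freeze

# Count clusters through the String API.
def counts_via_string_api(samples = SAMPLES)
  samples.transform_values { |s| s.grapheme_clusters.length }
end

# Count clusters through the regular expression engine.
def counts_via_regexp(samples = SAMPLES)
  samples.transform_values { |s| s.scan(/\X/).length }
end

counts_via_string_api.each do |label, n|
  agrees = counts_via_regexp[label] == n
  puts "#{label}: #{n} cluster(s), /\\X/ agrees: #{agrees}"
end
```

Both code paths should report the same counts; a disagreement between them, or a split "\r\n", would point at the kind of problem tracked in the blocking issues.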
#6

Updated by duerst (Martin Dürst) 8 months ago

  • Blocked by Feature #15182: Update extended grapheme cluster implementation for Unicode 11 added

Updated by duerst (Martin Dürst) 8 months ago

naruse (Yui NARUSE) wrote:

Just a note, the definition of extended grapheme cluster is changed in Unicode 11

This is mentioned at http://www.unicode.org/versions/Unicode11.0.0/, so I was (vaguely) aware of it, but thanks for the reminder. I have created a subissue at #15182. I may have to get back to you for some help, but first I have to fight with #14802 :-(.

#8

Updated by duerst (Martin Dürst) 6 months ago

  • Blocked by Feature #15317: How to deal with obsolete property values in Unicode 11.0.0 added
#9

Updated by duerst (Martin Dürst) 6 months ago

  • Blocks Feature #15321: Update Unicode data to Unicode Version 12.0.0 added
#10

Updated by duerst (Martin Dürst) 6 months ago

  • Blocked by Bug #15337: String#each_grapheme_cluster wrongly splits "\r\n" added

Updated by duerst (Martin Dürst) 5 months ago

  • Status changed from Open to Closed

Some hints for future Unicode updates:

  • Check early whether modifications to algorithms, etc. are necessary.

  • For tests, these are the main ones:
    test/test_unicode_normalize.rb
    test/ruby/enc
    test/ruby/test_m17n*
    test/ruby/test_regexp.rb
    test/ruby/test_string*

  • There are also some specs involved, so make sure to check them, too.
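
The test list above can be turned into concrete file paths with a short Ruby sketch. The constant and the helper name unicode_test_files are hypothetical, and the code assumes it is pointed at the root of a Ruby source checkout; it only lists files and does not run them.

```ruby
# Patterns copied from the hints above; "test/ruby/enc" is a directory.
UNICODE_TEST_PATTERNS = %w[
  test/test_unicode_normalize.rb
  test/ruby/enc
  test/ruby/test_m17n*
  test/ruby/test_regexp.rb
  test/ruby/test_string*
].freeze

# Hypothetical helper: expand each pattern relative to `root`.
# Directories are searched recursively for test_*.rb files.
def unicode_test_files(root)
  UNICODE_TEST_PATTERNS.flat_map do |pattern|
    Dir.glob(File.join(root, pattern)).flat_map do |path|
      if File.directory?(path)
        Dir.glob(File.join(path, "**", "test_*.rb"))
      else
        [path]
      end
    end
  end.sort.uniq
end
```

Calling unicode_test_files("path/to/ruby") returns a sorted array of paths that can then be handed to whatever test runner the checkout uses.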
