Bug #18028


test/ruby/enc/test_emoji_breaks.rb does not deal with Unicode ranges in file emoji-sequences.txt

Added by duerst (Martin Dürst) 3 months ago. Updated about 2 months ago.

Target version:
ruby -v:
ruby 3.1.0dev (2021-06-03T06:59:33Z master 7e14762159) [x86_64-linux]


While working on issue #17750, I found out that test_emoji_breaks.rb does not deal with Unicode ranges in the file emoji-sequences.txt. That means that the tests may not cover all emoji. This should eventually be fixed, but requires some rewriting of the code, which I plan to do independently of the Unicode/Emoji version upgrade.
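To illustrate what handling ranges could look like, here is a minimal sketch. It assumes that the first field of a data line in emoji-sequences.txt is either a range such as `231A..231B`, a single code point, or a space-separated sequence of code points. The helper name `expand_emoji_field` is hypothetical and is not the code used in test_emoji_breaks.rb.

```ruby
# Hypothetical helper (not taken from test_emoji_breaks.rb).
# Expands the first field of one emoji-sequences.txt line into an
# array of UTF-8 strings, one per emoji or emoji sequence.
def expand_emoji_field(line)
  data = line.sub(/#.*/, '').strip      # drop the trailing comment, if any
  return [] if data.empty?              # blank or comment-only line

  field = data.split(';').first.strip   # first field holds the code points
  if (m = field.match(/\A(\h+)\.\.(\h+)\z/))
    # Range such as "231A..231B": one single-code-point emoji per value
    (m[1].hex..m[2].hex).map { |cp| [cp].pack('U') }
  else
    # Single code point or a sequence such as "0023 FE0F 20E3"
    [field.split.map(&:hex).pack('U*')]
  end
end
```

A test that only handles the second branch would treat a range line as unparseable or skip it, which is how some emoji can end up untested.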

Related issues

Related to Ruby master - Feature #17750: Update Unicode data to Unicode Version 13.0.0 - Closed - duerst (Martin Dürst)

Updated by duerst (Martin Dürst) 3 months ago

  • Related to Feature #17750: Update Unicode data to Unicode Version 13.0.0 added

Updated by duerst (Martin Dürst) 2 months ago

One of the testing scripts (test/ruby/enc/test_emoji_breaks.rb) checks that the version declared internally in the data files matches the version we expect. In that context, I ran into the following problem, which I reported via standard channels to the Unicode Consortium:

Emoji data files in internally say they are for version 13.1. But the files moved to, say "# Version: 13.0". We keep both a Unicode version and an Emoji version (available in Ruby via RbConfig::CONFIG['UNICODE_VERSION'] and RbConfig::CONFIG['UNICODE_EMOJI_VERSION']). But neither of them matches 13.0. For the files moved under, they really should indicate the Unicode version, not the Emoji version, because they are updated in sync with Unicode versions, and not updated when only Emoji versions get updated.

As a temporary measure, I plan to ignore the version in the moved file(s).
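The version comparison described above can be sketched roughly as follows. This is only an illustration: the helper `declared_version` and the sample header text are hypothetical, and only the two `RbConfig::CONFIG` keys come from the comment above.

```ruby
require 'rbconfig'

# Versions recorded for this Ruby build (keys as named in the comment above).
unicode_version = RbConfig::CONFIG['UNICODE_VERSION']
emoji_version   = RbConfig::CONFIG['UNICODE_EMOJI_VERSION']

# Hypothetical helper: extract "13.0" from a header line such as
# "# Version: 13.0". Returns nil if no such line is present.
def declared_version(header_text)
  header_text[/^# Version: (\d+(?:\.\d+)*)/, 1]
end

# Sample header, made up for illustration; a real test would read the data file.
header   = "# emoji-data.txt\n# Version: 13.0\n"
declared = declared_version(header)

# An exact string comparison fails whenever the file declares a version
# that equals neither configured value, which is the mismatch reported here.
matches = [unicode_version, emoji_version].include?(declared)
puts "declared=#{declared.inspect} unicode=#{unicode_version.inspect} " \
     "emoji=#{emoji_version.inspect} matches=#{matches}"
```

Ignoring the declared version for the moved file(s), as planned, amounts to skipping this comparison for those files.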

Updated by duerst (Martin Dürst) about 2 months ago

  • Status changed from Open to Closed

Completed with commit 26b1e6fca8.

