Bug #18027

closed

test/ruby/enc/test_emoji_breaks.rb does not use the file emoji-variation-sequences.txt

Added by duerst (Martin Dürst) 3 months ago. Updated 2 months ago.

Status: Closed
Priority: Normal
Target version:
ruby -v: ruby 3.1.0dev (2021-06-03T06:59:33Z master 7e14762159) [x86_64-linux]
[ruby-core:104539]

Description

While working on issue #17750, I found out that the number of assertions is essentially the same whether or not the file emoji-variation-sequences.txt is included in the files used to create the test data. This is the case both before and after the move of this file. So it seems unrelated to the move of the file, or to the version upgrade.

I have not yet been able to figure out a reason for this behavior; the structure of the file emoji-variation-sequences.txt looks just like the structure of the other three files (emoji-sequences.txt, emoji-test.txt, and emoji-zwj-sequences.txt).

This bug serves to track this issue. Any hints appreciated. In particular, I'm a puts debugger, but puts debugging doesn't seem to work well when testing.


Related issues

Related to Ruby master - Feature #17750: Update Unicode data to Unicode Version 13.0.0 (Closed, duerst (Martin Dürst))

Updated by duerst (Martin Dürst) 3 months ago

  • Related to Feature #17750: Update Unicode data to Unicode Version 13.0.0 added

Updated by duerst (Martin Dürst) 2 months ago

duerst (Martin Dürst) wrote:

In particular, I'm a puts debugger, but puts debugging doesn't seem to work well when testing.

I found that adding and using the following method in the Test... class allowed me to use log_test instead of puts for debugging. If you know of a better way, please tell me.

  # Append a debug message to log_test.txt, as a stand-in for puts during test runs.
  def log_test(message)
    File.open('log_test.txt', 'a') { |f| f.write("#{message}\n") }
  end

Updated by duerst (Martin Dürst) 2 months ago

  • Status changed from Open to Closed

Resolved with commit fd7f61c.

It turned out the problem was that in emoji-variation-sequences.txt, lines look like
  0023 FE0E ; text style  ; # (1.1) NUMBER SIGN
whereas in the other two files that we processed the same way, they look like
  23F0      ; Basic_Emoji ; alarm clock # E0.6 [1] (⏰)
(aligned to show the similarities and differences). The empty field between the second semicolon and the hash mark wasn't properly accounted for, but this has now been fixed in commit fd7f61c.
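For readers who want to see the difference concretely, here is a minimal sketch of a line parser that tolerates the empty third field. It is not the code from commit fd7f61c; the helper name parse_emoji_line and the returned hash layout are hypothetical, and code point ranges such as 231A..231B are not handled.

```ruby
# Hypothetical helper (not taken from commit fd7f61c).
# Splits one data line into code points, type field, and description.
def parse_emoji_line(line)
  # Everything after the first '#' is a comment and is dropped.
  data, _comment = line.split('#', 2)
  return nil if data.nil? || data.strip.empty?

  # A limit of -1 keeps trailing empty fields, so a line whose third
  # field is empty still yields three fields.
  codes, type, name = data.split(';', -1).map(&:strip)
  {
    codes: codes.split.map { |c| c.to_i(16) }, # hex strings to integers
    type: type,
    name: name.to_s                            # '' when the field is empty
  }
end

variation = parse_emoji_line('0023 FE0E ; text style ; # (1.1) NUMBER SIGN')
sequence  = parse_emoji_line('23F0 ; Basic_Emoji; alarm clock # E0.6 [1] (⏰)')
```

With this sketch, both line shapes produce a three-field result: the variation line gets an empty name, while the sequence line gets "alarm clock". Pure comment lines and blank lines return nil.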
