Bug #19532


Handling of 6-byte codepoints in left_adjust_char_head in CESU-8 encoding is broken

Added by Eregon (Benoit Daloze) about 1 year ago. Updated 9 months ago.

Status:
Closed
Assignee:
-
Target version:
-
ruby -v:
ruby 3.2.1 (2023-02-08 revision 31819e82c8) [x86_64-linux]
[ruby-core:112918]

Description

irb(main):001:0> "\u{10000}".encode("cesu-8").chop
=> "\xED\xA0\x80"

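But it should be "": U+10000 is a single character, encoded in CESU-8 as six bytes (a surrogate pair, ED A0 80 ED B0 80), so chop should remove all six bytes rather than only the trailing three.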

Fix in https://github.com/ruby/ruby/pull/7510
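
To make the byte layout behind this report concrete, here is a minimal sketch that builds the CESU-8 bytes of a supplementary codepoint by hand. The helper name cesu8_bytes is hypothetical and only for illustration; it is not part of Ruby or of the linked pull request.

```ruby
# Hypothetical helper for illustration only (not a Ruby API).
# CESU-8 encodes a supplementary codepoint as a UTF-16 surrogate pair,
# with each surrogate written as a 3-byte UTF-8-style sequence.
def cesu8_bytes(codepoint)
  unless codepoint.between?(0x10000, 0x10FFFF)
    raise ArgumentError, "expected a supplementary codepoint"
  end

  offset = codepoint - 0x10000
  high = 0xD800 + (offset >> 10)   # high (lead) surrogate
  low  = 0xDC00 + (offset & 0x3FF) # low (trail) surrogate

  [high, low].flat_map do |s|
    [0xE0 | (s >> 12), 0x80 | ((s >> 6) & 0x3F), 0x80 | (s & 0x3F)]
  end
end

bytes = cesu8_bytes(0x10000)
puts bytes.map { |b| format("%02X", b) }.join(" ")
# The first three bytes are the high surrogate alone, which is what the
# buggy chop left behind.
puts bytes[0, 3].map { |b| format("%02X", b) }.join(" ")
```

The first half of the sequence matches the "\xED\xA0\x80" shown in the irb session above, which suggests that chop treated the low surrogate as a character of its own.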

Actions #1

Updated by Eregon (Benoit Daloze) about 1 year ago

  • Description updated (diff)
Actions #2

Updated by Eregon (Benoit Daloze) about 1 year ago

  • ruby -v set to ruby 3.2.1 (2023-02-08 revision 31819e82c8) [x86_64-linux]

Updated by nobu (Nobuyoshi Nakada) about 1 year ago

The change itself looks fine, but I'm also in favor of obsoleting CESU-8.

Updated by usa (Usaku NAKAMURA) 10 months ago

  • Backport changed from 2.7: UNKNOWN, 3.0: REQUIRED, 3.1: REQUIRED, 3.2: REQUIRED to 2.7: UNKNOWN, 3.0: REQUIRED, 3.1: DONE, 3.2: REQUIRED

ruby_3_1 0275614ba213dfb6f05743a16f65623bc3b6e274 merged revision(s) 2c8f287.

Actions #5

Updated by Anonymous 10 months ago

  • Status changed from Open to Closed

Applied in changeset git|0275614ba213dfb6f05743a16f65623bc3b6e274.


merge revision(s) 2c8f287: [Backport #19532]

    Fix handling of 6-byte codepoints in left_adjust_char_head in CESU-8
     encoding

    ---
     enc/cesu_8.c                | 23 +++++++++++++++++++----
     test/ruby/enc/test_cesu8.rb |  4 ++++
     2 files changed, 23 insertions(+), 4 deletions(-)
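
As a rough illustration of what "left-adjusting to the character head" has to do for 6-byte sequences, the following Ruby sketch works on a plain byte array. It is an assumption-laden model, not the C code from enc/cesu_8.c: the method name cesu8_char_head is hypothetical, and the sketch assumes well-formed input.

```ruby
# Hypothetical sketch, not the actual enc/cesu_8.c implementation.
# Given a byte array and an index anywhere inside a character, return the
# index of that character's first byte.
def cesu8_char_head(bytes, index)
  # Step back over continuation bytes (10xxxxxx) to reach a lead byte.
  index -= 1 while index > 0 && (bytes[index] & 0xC0) == 0x80

  # If we landed on a low surrogate (ED B0..BF xx) that directly follows a
  # high surrogate (ED A0..AF xx), the character starts 3 bytes earlier.
  if index >= 3 &&
     bytes[index] == 0xED && bytes[index + 1].to_i.between?(0xB0, 0xBF) &&
     bytes[index - 3] == 0xED && bytes[index - 2].between?(0xA0, 0xAF)
    index -= 3
  end

  index
end

pair = [0xED, 0xA0, 0x80, 0xED, 0xB0, 0x80] # U+10000 in CESU-8
puts cesu8_char_head(pair, 5) # head of the whole 6-byte character
```

Without the surrogate check, the function would stop at index 3, the lead byte of the low surrogate, which would be consistent with the behavior described in the report.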

Updated by nagachika (Tomoyuki Chikanaga) 9 months ago

  • Backport changed from 2.7: UNKNOWN, 3.0: REQUIRED, 3.1: DONE, 3.2: REQUIRED to 2.7: UNKNOWN, 3.0: REQUIRED, 3.1: DONE, 3.2: DONE

ruby_3_2 4e0653db3315e9e7859e38e0995e2b9900471370 merged revision(s) 2c8f2871a8aeff592369a993b1d69557160cfa61.
