Bug #19532

Handling of 6-byte codepoints in left_adjust_char_head in CESU-8 encoding is broken

Added by Eregon (Benoit Daloze) 3 months ago. Updated 3 months ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 3.2.1 (2023-02-08 revision 31819e82c8) [x86_64-linux]
[ruby-core:112918]

Description

irb(main):001:0> "\u{10000}".encode("cesu-8").chop
=> "\xED\xA0\x80"

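But it should be "": in CESU-8, U+10000 is a single character encoded as a 6-byte surrogate pair (ED A0 80 ED B0 80), so chop should remove all 6 bytes, not only the trailing 3.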

Fix in https://github.com/ruby/ruby/pull/7510
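
For reference, here is a minimal sketch of how CESU-8 lays out a supplementary code point, which is why the result above stops in the middle of a character. The helper name cesu8_bytes is hypothetical and only for illustration; it is not part of Ruby and not taken from the linked pull request.

```ruby
# Hypothetical helper (not part of Ruby): encode one code point as CESU-8 bytes.
def cesu8_bytes(codepoint)
  # Encode a 16-bit code unit with the ordinary 3-byte UTF-8 bit pattern.
  three_bytes = lambda do |unit|
    [0xE0 | (unit >> 12), 0x80 | ((unit >> 6) & 0x3F), 0x80 | (unit & 0x3F)]
  end

  # Code points below U+10000 are encoded exactly as in UTF-8.
  return [codepoint].pack("U").bytes if codepoint < 0x10000

  # Supplementary code points are split into a UTF-16 surrogate pair,
  # and each surrogate is then written as its own 3-byte sequence.
  offset = codepoint - 0x10000
  high = 0xD800 | (offset >> 10)
  low  = 0xDC00 | (offset & 0x3FF)
  three_bytes.call(high) + three_bytes.call(low)
end

bytes = cesu8_bytes(0x10000)
hex = bytes.map { |b| format("%02X", b) }.join(" ")
puts hex # prints ED A0 80 ED B0 80

# Compare with what Ruby's own transcoder produces for the same character.
puts "\u{10000}".encode("cesu-8").bytes == bytes
```

The first three bytes (ED A0 80) are the high surrogate and the last three (ED B0 80) are the low surrogate. A routine that looks for the start of the last character has to step back over both halves; stepping back over only the low surrogate leaves the high surrogate behind, which matches the "\xED\xA0\x80" shown in the report.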

Actions #1

Updated by Eregon (Benoit Daloze) 3 months ago

  • Description updated

Actions #2

Updated by Eregon (Benoit Daloze) 3 months ago

  • ruby -v set to ruby 3.2.1 (2023-02-08 revision 31819e82c8) [x86_64-linux]

Updated by nobu (Nobuyoshi Nakada) 3 months ago

The change itself looks fine, but I'm also in favor of obsoleting CESU-8.
