Bug #21683:
IO#each_codepoint does not take encoding into account when the IO uses encoding conversion for reading.
Status:
Open
Assignee:
-
Target version:
-
ruby -v:
ruby 3.5.0dev (2025-11-03T10:33:44Z master 0832e954c9) +PRISM [x64-mingw-ucrt]
Description
Without encoding conversion:
irb(main):001> open(File::NULL, 'r') { |f| f.ungetc(%Q[\u{3042}\u{3044}\u{3046}]); f.each_codepoint.map { |c| c.to_s(16) } }
=> ["3042", "3044", "3046"] # => valid
With encoding conversion:
irb(main):001> open(File::NULL, 'rt') { |f| f.ungetc(%Q[\u{3042}\u{3044}\u{3046}]); f.each_codepoint.map { |c| c.to_s(16) } }
=> ["e3", "81", "82", "e3", "81", "84", "e3", "81", "86"] # => invalid
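The values marked invalid above are not arbitrary: they are the individual UTF-8 bytes of the same three characters, which suggests the data is being yielded byte by byte rather than as code points. A minimal sketch to check this without involving IO at all (the names `SAMPLE`, `hex_codepoints` and `hex_utf8_bytes` are made up for this illustration and are not part of the report):

```ruby
# The three characters used in the report: U+3042, U+3044, U+3046.
SAMPLE = "\u{3042}\u{3044}\u{3046}"

# Hex code points of a string, i.e. what each_codepoint is expected to yield.
def hex_codepoints(str)
  str.codepoints.map { |c| c.to_s(16) }
end

# Hex values of the raw bytes of the string's UTF-8 encoding.
def hex_utf8_bytes(str)
  str.encode(Encoding::UTF_8).bytes.map { |b| b.to_s(16) }
end

p hex_codepoints(SAMPLE)  # => ["3042", "3044", "3046"]
p hex_utf8_bytes(SAMPLE)  # => ["e3", "81", "82", "e3", "81", "84", "e3", "81", "86"]
```

The second list is identical to the result returned in 'rt' mode.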
Ruby versions prior to 3.4 lack commit 6cd98c24fe9aeea3829ac3d554a277f053cec0be (Allow IO#each_codepoint to work with unetc even when encoding conversion active).
This can similarly be reproduced using ungetbyte:
irb(main):001> open(File::NULL, 'rt') { |f| f.ungetbyte(%Q[\u{3042}\u{3044}\u{3046}]); p f.each_codepoint.map { |c| c.to_s(16) } }
=> ["e3", "81", "82", "e3", "81", "84", "e3", "81", "86"]
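For convenience, the ungetc one-liners can be wrapped in a small script that runs the same steps for any open mode. This is only a sketch: the helper name `codepoints_after_ungetc` and its default argument are assumptions made for this example, and the result for 'rt' depends on the Ruby version, so no expected output is stated for it here.

```ruby
# Opens the null device in the given mode, pushes the sample string back
# with ungetc, then collects what each_codepoint yields as hex strings.
def codepoints_after_ungetc(mode, str = "\u{3042}\u{3044}\u{3046}")
  File.open(File::NULL, mode) do |f|
    f.ungetc(str)
    f.each_codepoint.map { |c| c.to_s(16) }
  end
end

if $PROGRAM_NAME == __FILE__
  %w[r rt].each do |mode|
    begin
      puts "#{mode}: #{codepoints_after_ungetc(mode).inspect}"
    rescue IOError => e
      # Some Ruby versions may raise here instead of returning a result.
      puts "#{mode}: IOError (#{e.message})"
    end
  end
end
```

If the two printed lines differ, the text-mode path is not decoding the pushed-back data into code points.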