Bug #16099

closed

UTF-16LE BOM followed by '\0' is missed

Added by nobu (Nobuyoshi Nakada) almost 6 years ago. Updated almost 6 years ago.

Status:
Closed
Target version:
-
ruby -v:
[ruby-core:94326]

Description

$ ruby -e 'File.binwrite("u.txt", "\xff\xfe\x00\x01")'
$ file u.txt
u.txt: Little-endian UTF-16 Unicode text, with no line terminators
$ ruby -e 'p File.open("u.txt", "rb:bom|utf-8", &:external_encoding)'
#<Encoding:UTF-8>

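The last result must be UTF-16LE: the file starts with the UTF-16LE BOM (0xFF 0xFE), and the following '\0' byte should not prevent it from being detected.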
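The reproduction above can also be written as one self-contained script. This is only a sketch: the helper name `bom_detected_encoding` is made up for illustration, and the outputs depend on the interpreter. A Ruby that includes the fix for this issue should report UTF-16LE for the first case, while an affected Ruby reports UTF-8.

```ruby
require "tempfile"

# Hypothetical helper (not part of Ruby): writes +bytes+ to a temporary file
# and returns the external encoding Ruby picks when the file is opened with
# BOM detection and UTF-8 as the fallback.
def bom_detected_encoding(bytes)
  Tempfile.create("bom") do |f|
    f.binmode
    f.write(bytes)
    f.close
    File.open(f.path, "rb:BOM|UTF-8", &:external_encoding)
  end
end

p bom_detected_encoding("\xFF\xFE\x00\x01".b) # the case from this report
p bom_detected_encoding("\xFF\xFE\x00\x00".b) # a UTF-32LE BOM
p bom_detected_encoding("abc".b)              # no BOM, falls back to UTF-8
```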

Updated by nobu (Nobuyoshi Nakada) almost 6 years ago

  • Status changed from Assigned to Closed

Applied in changeset git|5b1bf8dd2d08ae7371ecf025967376bb794ed651.


UTF LE is fixed at least the first 2 bytes

  • io.c (io_strip_bom): if the first 2 bytes are 0xFF0xFE, it
    should be a little-endian UTF, 16 or 32. [Bug #16099]
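The rule in the commit message can be illustrated in plain Ruby. This is a minimal sketch of the decision only, not the C code in io.c, and `little_endian_bom` is a hypothetical name: once the first two bytes are 0xFF 0xFE, the data is treated as little-endian UTF, and the next two bytes only decide between 16 and 32.

```ruby
# Hypothetical helper, for illustration only (not the actual io_strip_bom).
# Returns the encoding implied by a little-endian BOM at the start of +bytes+,
# or nil when the data does not begin with 0xFF 0xFE.
def little_endian_bom(bytes)
  b = bytes.bytes
  return nil unless b[0] == 0xFF && b[1] == 0xFE
  # 0xFF 0xFE 0x00 0x00 is the UTF-32LE BOM; any other continuation,
  # including a single 0x00 byte, still leaves a UTF-16LE BOM.
  return Encoding::UTF_32LE if b[2] == 0 && b[3] == 0
  Encoding::UTF_16LE
end

p little_endian_bom("\xFF\xFE\x00\x01".b) # the byte sequence from this report
```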

Updated by nagachika (Tomoyuki Chikanaga) almost 6 years ago

  • Backport changed from 2.5: REQUIRED, 2.6: REQUIRED to 2.5: REQUIRED, 2.6: DONE

ruby_2_6 r67746 merged revision(s) 5b1bf8dd2d08ae7371ecf025967376bb794ed651.

Updated by usa (Usaku NAKAMURA) almost 6 years ago

  • Backport changed from 2.5: REQUIRED, 2.6: DONE to 2.5: DONE, 2.6: DONE

ruby_2_5 r67772 merged revision(s) 5b1bf8dd2d08ae7371ecf025967376bb794ed651.
