Bug #9415

String#codepoints doesn't respect BOM on UTF-{16,32} pseudo encodings

Added by Nobuyoshi Nakada over 1 year ago. Updated about 1 year ago.

[ruby-dev:<unknown>]
Status:Closed
Priority:Normal
Assignee:Yui NARUSE
ruby -v:-
Backport:1.9.3: REQUIRED, 2.0.0: REQUIRED, 2.1: DONE

Description

String#codepoints does not take the BOM into account for UTF-16 and UTF-32.

$ ruby -e 'puts "%x" % "\u{feff}".encode("UTF-16BE").force_encoding("UTF-16").codepoints'
feff
$ ruby -e 'puts "%x" % "\u{feff}".encode("UTF-16LE").force_encoding("UTF-16").codepoints'
fffe

The same applies to String#ord and similar methods.

$ ruby -e 'printf "%x\n", "\u{feff}".encode("UTF-16BE").force_encoding("UTF-16").ord'
feff
$ ruby -e 'printf "%x\n", "\u{feff}".encode("UTF-16LE").force_encoding("UTF-16").ord'
fffe
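
For illustration only, the expected result can be reproduced by hand: inspect the leading bytes, pick the concrete endian-specific encoding, and decode with that. The helper name `bom_aware_codepoints` is hypothetical and not part of Ruby, and falling back to big-endian when no BOM is present is an assumption of this sketch, not necessarily what string.c does.

```ruby
# Hypothetical helper (not part of Ruby): pick the concrete encoding
# from the byte order mark before asking for code points.
def bom_aware_codepoints(str)
  bytes = str.b # binary copy, so the raw leading bytes can be compared
  actual =
    case str.encoding
    when Encoding::UTF_16
      # FF FE means little-endian; otherwise assume big-endian (assumption).
      bytes.start_with?("\xFF\xFE".b) ? Encoding::UTF_16LE : Encoding::UTF_16BE
    when Encoding::UTF_32
      # FF FE 00 00 means little-endian; otherwise assume big-endian (assumption).
      bytes.start_with?("\xFF\xFE\x00\x00".b) ? Encoding::UTF_32LE : Encoding::UTF_32BE
    else
      str.encoding
    end
  # Re-tag a copy with the concrete encoding; the BOM itself is kept as U+FEFF.
  str.dup.force_encoding(actual).codepoints
end

le = "\u{feff}".encode("UTF-16LE").force_encoding("UTF-16")
be = "\u{feff}".encode("UTF-16BE").force_encoding("UTF-16")
printf "%x\n", bom_aware_codepoints(le).first
printf "%x\n", bom_aware_codepoints(be).first
```

With this sketch both strings yield the same code point regardless of byte order, which is the behavior the report asks for.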

Related issues

Related to Ruby trunk - Bug #8940: printing UTF-32 crashs ruby Closed 09/23/2013

Associated revisions

Revision 44606
Added by Nobuyoshi Nakada over 1 year ago

string.c: respect BOM

  • string.c (get_encoding): respect BOM on pseudo encodings. [Bug #9415]

History

#1 Updated by Nobuyoshi Nakada over 1 year ago

  • ruby -v changed from r44601 to -

Issue #9415 has been reported by Nobuyoshi Nakada.


Bug #9415: String#codepoints doesn't respect BOM on UTF-{16,32} pseudo encodings
https://bugs.ruby-lang.org/issues/9415

  • Author: Nobuyoshi Nakada
  • Status: Open
  • Priority: Normal
  • Assignee: Yui NARUSE
  • Category: M17N
  • Target version: current: 2.2.0
  • ruby -v: r44601
  • Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: REQUIRED

----------------------------------------
String#codepoints does not take the BOM into account for UTF-16 and UTF-32.

 $ ruby -e 'puts "%x" % "\u{feff}".encode("UTF-16BE").force_encoding("UTF-16").codepoints'
 feff
 $ ruby -e 'puts "%x" % "\u{feff}".encode("UTF-16LE").force_encoding("UTF-16").codepoints'
 fffe

The same applies to String#ord and similar methods.

 $ ruby -e 'printf "%x\n", "\u{feff}".encode("UTF-16BE").force_encoding("UTF-16").ord'
 feff
 $ ruby -e 'printf "%x\n", "\u{feff}".encode("UTF-16LE").force_encoding("UTF-16").ord'
 fffe

--
http://bugs.ruby-lang.org/

#2 Updated by Nobuyoshi Nakada over 1 year ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

Applied in changeset r44606.


string.c: respect BOM

  • string.c (get_encoding): respect BOM on pseudo encodings. [Bug #9415]

#3 Updated by Nobuyoshi Nakada over 1 year ago

  • Related to Bug #8940: printing UTF-32 crashs ruby added

#4 Updated by Usaku NAKAMURA about 1 year ago

  • Backport changed from 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: REQUIRED to 1.9.3: REQUIRED, 2.0.0: REQUIRED, 2.1: REQUIRED

#5 Updated by Yui NARUSE about 1 year ago

  • Backport changed from 1.9.3: REQUIRED, 2.0.0: REQUIRED, 2.1: REQUIRED to 1.9.3: REQUIRED, 2.0.0: REQUIRED, 2.1: DONE

r45074
