Bug #10418

REXML's encoding is broken when reading UTF-16 XML while Encoding.default_internal is set

Added by usa (Usaku NAKAMURA) almost 5 years ago. Updated almost 5 years ago.

Status: Closed
Priority: Normal
Target version:
[ruby-dev:48686]

Description

When Encoding.default_internal is set and REXML is fed a UTF-16 IO, REXML::Document#encoding returns Encoding.default_internal instead of UTF-16.
A patch follows.

Index: lib/rexml/source.rb
===================================================================
--- lib/rexml/source.rb (revision 48095)
+++ lib/rexml/source.rb (working copy)
@@ -285,7 +285,7 @@
       case @encoding
       when "UTF-16BE", "UTF-16LE"
         @source.binmode
-        @source.set_encoding(@encoding)
+        @source.set_encoding(@encoding, @encoding)
       end
       @line_break = encode(">")
       @pending_buffer, @buffer = @buffer, ""
Index: test/rexml/test_encoding.rb
===================================================================
--- test/rexml/test_encoding.rb (revision 48095)
+++ test/rexml/test_encoding.rb (working copy)
@@ -91,8 +91,18 @@
       utf16 = File.open(fixture_path("ticket_110_utf16.xml")) do |f|
         REXML::Document.new(f)
       end
-      assert_equal(utf16.encoding, "UTF-16")
+      assert_equal("UTF-16", utf16.encoding)
       assert( utf16[0].kind_of?(REXML::XMLDecl))
     end
+
+    def test_default_internal_with_utf16
+      orig_internal = ::Encoding.default_internal
+      ::Encoding.default_internal = 'utf-8'
+      begin
+        test_ticket_110
+      ensure
+        ::Encoding.default_internal = orig_internal
+      end
+    end
   end
 end
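The one-line change in lib/rexml/source.rb rests on how IO#set_encoding treats a missing internal encoding. The following is a minimal sketch of that behavior outside REXML, not part of the patch. The helper names `probe` and `utf16_probe_results` are hypothetical, and the one-argument result assumes the interpreter still falls back to Encoding.default_internal, as described in this report and in Bug #10417.

```ruby
require "tempfile"

# Hypothetical helper: open +path+ in binary mode, apply IO#set_encoding
# with the given arguments, and report what the stream hands back.
def probe(path, *enc_args)
  File.open(path, "rb") do |io|
    io.set_encoding(*enc_args)
    { internal: io.internal_encoding, data: io.read.encoding }
  end
end

# Hypothetical driver: compare one-argument and two-argument set_encoding
# on a small UTF-16BE file while Encoding.default_internal is UTF-8.
def utf16_probe_results
  orig = Encoding.default_internal
  verbose, $VERBOSE = $VERBOSE, nil   # mute the warning emitted under -w
  Encoding.default_internal = "UTF-8"
  Tempfile.create("utf16be") do |tmp|
    tmp.close
    File.binwrite(tmp.path, "<a/>".encode("UTF-16BE"))
    { one_arg:  probe(tmp.path, "UTF-16BE"),
      two_args: probe(tmp.path, "UTF-16BE", "UTF-16BE") }
  end
ensure
  Encoding.default_internal = orig
  $VERBOSE = verbose
end

p utf16_probe_results
```

With a single argument the stream is expected to transcode to the default internal encoding, so the data no longer arrives as UTF-16. Passing the same encoding twice disables transcoding in IO, which leaves the conversion to REXML::Source.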


Related issues

Related to Ruby master - Bug #10417: IO#set_encoding without int_enc doesn't keep current internal encoding (Open, 10/22/2014)

Associated revisions

Revision 93647e81
Added by kou (Kouhei Sutou) almost 5 years ago

  • lib/rexml/source.rb (REXML::IOSource#encoding_updated): Fix a
    bug that can't parse XML correctly when
    Encoding.default_internal is different with XML
    encoding. REXML::Source converts XML encoding on read. So IO
    should not convert XML encoding.
    Based on patch by NAKAMURA Usaku.
    [ruby-dev:48686] [Bug #10418]

  • test/rexml/test_encoding.rb
    (REXMLTests::EncodingTester#test_parse_utf16_with_utf8_default_internal):
    Add the for the above case.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@48109 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 48109
Added by kou (Kouhei Sutou) almost 5 years ago

  • lib/rexml/source.rb (REXML::IOSource#encoding_updated): Fix a
    bug that can't parse XML correctly when
    Encoding.default_internal is different with XML
    encoding. REXML::Source converts XML encoding on read. So IO
    should not convert XML encoding.
    Based on patch by NAKAMURA Usaku.
    [ruby-dev:48686] [Bug #10418]

  • test/rexml/test_encoding.rb
    (REXMLTests::EncodingTester#test_parse_utf16_with_utf8_default_internal):
    Add the for the above case.

History

Updated by usa (Usaku NAKAMURA) almost 5 years ago

  • Related to Bug #10417: IO#set_encoding without int_enc doesn't keep current internal encoding added

Updated by nobu (Nobuyoshi Nakada) almost 5 years ago

  • Description updated (diff)

This looks likely to emit a warning under -w, so wouldn't it be better to use EnvUtil.with_default_internal, or to do something equivalent?
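For readers without access to Ruby's internal test utilities, a helper in the spirit of EnvUtil.with_default_internal might look like the sketch below. It is an illustration only, not the EnvUtil source. The names `with_default_internal_quietly` and `set_default_internal_quietly` are hypothetical, and the sketch assumes that the warning in question is the one raised by assigning Encoding.default_internal in verbose mode.

```ruby
# Hypothetical helper: assign Encoding.default_internal with $VERBOSE
# temporarily cleared, so the assignment does not warn under -w.
def set_default_internal_quietly(enc)
  verbose, $VERBOSE = $VERBOSE, nil
  Encoding.default_internal = enc
ensure
  $VERBOSE = verbose
end

# Hypothetical helper: run the block with Encoding.default_internal set
# to +enc+, restoring the previous value even if the block raises.
def with_default_internal_quietly(enc)
  orig = Encoding.default_internal
  set_default_internal_quietly(enc)
  yield
ensure
  set_default_internal_quietly(orig)
end

# Usage: the override is visible only inside the block.
with_default_internal_quietly("UTF-8") do
  puts Encoding.default_internal.name
end
```

Compared with the begin/ensure block in the proposed test, this keeps the save-and-restore logic in one place and avoids the verbose-mode warning on both the assignment and the restore.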

Updated by kou (Kouhei Sutou) almost 5 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

Applied in changeset r48109.


Updated by kou (Kouhei Sutou) almost 5 years ago

I merged it, using EnvUtil.with_default_internal along the way!
