Bug #1449

closed

[REXML] detected encoding isn't used correctly

Added by kou (Kouhei Sutou) almost 12 years ago. Updated almost 10 years ago.

Status: Closed
Priority: Normal
Target version:
ruby -v: ruby 1.9.2dev (2009-05-09 trunk 23374) [x86_64-linux]
Backport:

[ruby-core:23404]

Description

=begin
REXML::Source can detect the source encoding from the XML declaration. REXML::IOSource can also detect it, but the detected encoding isn't used correctly.

REXML::IOSource uses the detected encoding to convert the data it reads from @source. If the detected encoding is UTF-8, the read data isn't converted (see rexml/encodings/UTF-8.rb). If the detected encoding is UTF-8 but @source.external_encoding isn't UTF-8, this may cause a problem.

If @source.external_encoding is ASCII-8BIT and @source contains only ASCII data, there is no problem. If @source.external_encoding is ASCII-8BIT and @source contains non-ASCII data, there is a problem: in that case, "@buffer << read_data_from_source" raises an Encoding::CompatibilityError, which breaks parsing of otherwise valid XML.
=end
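
The failure described above can be reproduced without REXML. The following is a minimal sketch, not the code from the attached patch or from the eventual fix: `try_append` is a hypothetical helper, and the strings stand in for `@buffer` and for data read from `@source`. It assumes the buffer already holds non-ASCII UTF-8 text, since Ruby only raises when both sides of `<<` contain non-ASCII bytes in incompatible encodings.

```ruby
# Hypothetical helper (not part of REXML): append a chunk to a copy of the
# buffer and report either the resulting encoding or the error class raised.
def try_append(buffer, chunk)
  buffer = buffer.dup
  buffer << chunk
  [:ok, buffer.encoding]
rescue Encoding::CompatibilityError => e
  [:error, e.class]
end

# Stand-in for @buffer: UTF-8 text that already contains a non-ASCII character.
utf8_buffer = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a>\u00e9"

# Stand-ins for data read from a source whose external encoding is ASCII-8BIT.
ascii_chunk  = "</a>".b          # ASCII-only bytes
binary_chunk = "\u00e9</a>".b    # same bytes as UTF-8 text, but tagged ASCII-8BIT

p try_append(utf8_buffer, ascii_chunk)   # ASCII-only data appends cleanly
p try_append(utf8_buffer, binary_chunk)  # non-ASCII data raises on append

# One possible workaround (an assumption, not necessarily what the patch does):
# relabel the bytes with the detected encoding before appending.
relabeled = binary_chunk.dup.force_encoding(Encoding::UTF_8)
p try_append(utf8_buffer, relabeled)
```

Relabeling with `force_encoding` changes only the encoding tag, not the bytes, so it is appropriate only when the bytes really are in the detected encoding.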


Files

ruby19-rexml-encoding-mismatch.diff (2.89 KB): a test case for the problem and a patch to fix it. kou (Kouhei Sutou), 05/09/2009 01:38 PM

Actions #1

Updated by yugui (Yuki Sonoda) almost 12 years ago

  • Assignee set to ser (Sean Russell)
  • Target version set to 1.9.1

Actions #2

Updated by naruse (Yui NARUSE) about 11 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

=begin
This issue was solved with changeset r27342.
Kouhei, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

=end
