Bug #1449
[REXML] detected encoding isn't used correctly
| Status: | Closed | Start date: | 05/09/2009 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | | % Done: | 100% |
| Category: | lib | | |
| Target version: | 1.9.1 | | |
| ruby -v: | ruby 1.9.2dev (2009-05-09 trunk 23374) [x86_64-linux] | | |
Description
REXML::Source can detect the source encoding from the XML declaration. REXML::IOSource can also detect it, but doesn't use it correctly.

REXML::IOSource uses the detected encoding to convert the data it reads from @source. If the detected encoding is UTF-8, the read data isn't converted (ref. rexml/encodings/UTF-8.rb). This may cause a problem when the detected encoding is UTF-8 but @source.external_encoding isn't UTF-8.

If @source.external_encoding is ASCII-8BIT and @source has only ASCII data, there is no problem. If @source.external_encoding is ASCII-8BIT and @source has non-ASCII data, "@buffer << read_data_from_source" raises an Encoding::CompatibilityError. This breaks parsing of otherwise correct XML.
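
The failure described above can be reproduced without REXML. The following is a minimal sketch, not the actual REXML code: `append_error` is a hypothetical helper, and the strings stand in for @buffer and for data read from an ASCII-8BIT @source. It assumes the buffer already holds non-ASCII UTF-8 text when the next chunk arrives.

```ruby
# Hypothetical helper: appends +chunk+ to +buffer+ and returns the class of
# the encoding error raised, or nil if the append succeeded.
def append_error(buffer, chunk)
  buffer << chunk
  nil
rescue EncodingError => e
  e.class
end

# Stand-in for @buffer: UTF-8 text that already contains a non-ASCII character.
utf8_buffer = "<a>\u3042".dup

# Stand-ins for data read from a source whose external encoding is ASCII-8BIT.
binary_ascii     = "</a>".b        # ASCII-only bytes
binary_non_ascii = "\u3044</a>".b  # same bytes as the UTF-8 text, tagged ASCII-8BIT

p append_error(utf8_buffer.dup, binary_ascii)      # => nil
p append_error(utf8_buffer.dup, binary_non_ascii)  # => Encoding::CompatibilityError
```

ASCII-only data is compatible with any ASCII-compatible encoding, which is why only the non-ASCII case fails.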
Associated revisions
* lib/rexml/source.rb: force_encoding("UTF-8") when the input
is already UTF-8. patched by Kouhei Sutou [ruby-core:23404]
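
The idea behind the change can be sketched as follows. This is an illustration of the force_encoding("UTF-8") approach named in the commit message, not the patch itself; `append_as_utf8` is a hypothetical helper, and it assumes the XML declaration has already told us the bytes are UTF-8.

```ruby
# Hypothetical helper: retags the chunk as UTF-8 (no byte conversion) before
# appending it, so a UTF-8 buffer and the chunk are encoding-compatible.
def append_as_utf8(buffer, chunk)
  buffer << chunk.dup.force_encoding("UTF-8")
end

buffer = "<a>\u3042".dup       # UTF-8 buffer with non-ASCII content
chunk  = "\u3044</a>".b        # UTF-8 bytes read with an ASCII-8BIT tag

append_as_utf8(buffer, chunk)

p buffer.encoding         # => #<Encoding:UTF-8>
p buffer.valid_encoding?  # => true
```

force_encoding only changes the encoding tag, so it is safe here only because the bytes really are UTF-8; for other declared encodings a real conversion is still needed.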
History
Updated by Yuki Sonoda over 2 years ago
- Assignee set to Sean Russell
- Target version set to 1.9.1
Updated by Yui NARUSE almost 2 years ago
- Status changed from Open to Closed
- % Done changed from 0 to 100
This issue was solved with changeset r27342. Kouhei, thank you for reporting this issue. Your contribution to Ruby is greatly appreciated. May Ruby be with you.