Bug #10149
Updated by nobu (Nobuyoshi Nakada) over 9 years ago
This bug is confirmed on 2.1.2p95.

There are (at least) two valid EUC-KR characters that do not get converted to UTF-8 properly.

**1. "\xA2\xE6" should convert to U+20AC (Euro Sign)**

Current behavior:

~~~ruby
irb(main):001:0> "\xA2\xE6".encode('UTF-8', 'EUC-KR')
Encoding::UndefinedConversionError: "\xA2\xE6" from EUC-KR to UTF-8
~~~

**2. "\xA2\xE7" should convert to U+00AE (Registered Sign)**

Current behavior:

~~~ruby
irb(main):002:0> "\xA2\xE7".encode('UTF-8', 'EUC-KR')
Encoding::UndefinedConversionError: "\xA2\xE7" from EUC-KR to UTF-8
~~~

I confirmed that both characters convert correctly in Python (output for the second one shown):

~~~python
>>> "\xA2\xE7".decode('euc-kr')
u'\xae'
~~~

I am guessing this is because these two characters are missing from this mapping table:
http://svn.ruby-lang.org/repos/ruby/trunk/enc/trans/euckr-tbl.rb
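Until the mapping table is fixed, a possible stopgap is the `fallback:` option of `String#encode`, which is consulted only when the converter reports an undefined character. The sketch below is a workaround under that assumption, not the fix for the table itself; the constant `MISSING_EUC_KR` and the helper `euckr_to_utf8` are hypothetical names introduced here for illustration, and the two entries are just the ones listed in this report.

```ruby
# Hypothetical lookup table: the two EUC-KR byte sequences from this report,
# mapped to the code points the report says they should convert to.
# Keys are binary strings so the lookup does not depend on string encoding.
MISSING_EUC_KR = {
  "\xA2\xE6".b => "\u20AC", # Euro Sign
  "\xA2\xE7".b => "\u00AE", # Registered Sign
}.freeze

# Hypothetical helper: convert EUC-KR bytes to UTF-8, consulting the table
# above only for characters the built-in converter treats as undefined.
# If the table has no entry, the lambda returns nil and encode still raises
# Encoding::UndefinedConversionError as before.
def euckr_to_utf8(bytes)
  bytes.encode('UTF-8', 'EUC-KR',
               fallback: ->(ch) { MISSING_EUC_KR[ch.b] })
end
```

With this in place, `euckr_to_utf8("\xA2\xE6")` should return the Euro Sign instead of raising. On a Ruby whose table already contains these entries, the fallback is never called, so the helper behaves the same as a plain `encode`.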