Bug #10149

Updated by nobu (Nobuyoshi Nakada) over 9 years ago

This bug is confirmed on Ruby 2.1.2p95.
 At least two valid EUC-KR characters are not converted to UTF-8 properly:

 **1. "\xA2\xE6" should convert to U+20AC (Euro Sign)** 
 Current behavior: 

 ~~~ruby 
 irb(main):001:0> "\xA2\xE6".encode('UTF-8', 'EUC-KR') 
 Encoding::UndefinedConversionError: "\xA2\xE6" from EUC-KR to UTF-8 
 ~~~ 

 **2. "\xA2\xE7" should convert to U+00AE (Registered Sign)** 
 Current behavior: 

 ~~~ruby 
 irb(main):002:0> "\xA2\xE7".encode('UTF-8', 'EUC-KR') 
 Encoding::UndefinedConversionError: "\xA2\xE7" from EUC-KR to UTF-8 
 ~~~ 
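
To check both cases in one run (for example when comparing Ruby versions), the two conversions above can be wrapped in a small script. This is only a sketch: the `EXPECTED` table and the `euckr_codepoint` helper are names made up for illustration, and the expected code points are the ones stated in this report.

```ruby
# Expected code points, taken from the report above.
EXPECTED = {
  [0xA2, 0xE6] => 0x20AC, # Euro Sign
  [0xA2, 0xE7] => 0x00AE  # Registered Sign
}

# Hypothetical helper: returns the Unicode code point that the given
# EUC-KR byte pair converts to, or nil when the conversion is undefined.
def euckr_codepoint(bytes)
  src = bytes.pack('C*').force_encoding('EUC-KR')
  src.encode('UTF-8').ord
rescue Encoding::UndefinedConversionError
  nil
end

EXPECTED.each do |bytes, expected|
  actual = euckr_codepoint(bytes)
  status = actual == expected ? 'ok' : 'FAILED'
  printf("%s -> %s (expected U+%04X): %s\n",
         bytes.map { |b| format('%02X', b) }.join(' '),
         actual ? format('U+%04X', actual) : 'undefined',
         expected, status)
end
```

On an affected build both lines should report `undefined`; on a build where the mapping is present they should report `ok`.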

 I confirmed that both characters convert correctly in Python; for example:

 ~~~python 
 >>> "\xA2\xE7".decode('euc-kr') 
 u'\xae' 
 ~~~ 

 I am guessing this is because these two characters are missing from this mapping table: http://svn.ruby-lang.org/repos/ruby/trunk/enc/trans/euckr-tbl.rb
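
If the guess about the mapping table is right, a possible stopgap until the table is fixed is to supply the two missing mappings through the `fallback:` option of `String#encode`. The sketch below assumes that option is available in the Ruby version in use; `EUCKR_PATCH` and `euckr_to_utf8` are hypothetical names, not part of Ruby.

```ruby
# Hypothetical stopgap table: the EUC-KR characters reported as missing,
# mapped to the UTF-8 strings they should convert to.
EUCKR_PATCH = {
  [0xA2, 0xE6] => "\u20AC", # Euro Sign
  [0xA2, 0xE7] => "\u00AE"  # Registered Sign
}.each_with_object({}) do |(bytes, utf8), table|
  # Keys must carry the source encoding, because the fallback is looked
  # up with the unconvertible character tagged as EUC-KR.
  table[bytes.pack('C*').force_encoding('EUC-KR')] = utf8
end

# Converts EUC-KR bytes to UTF-8, consulting EUCKR_PATCH only for
# characters the built-in converter reports as undefined.
def euckr_to_utf8(str)
  str.encode('UTF-8', 'EUC-KR', fallback: EUCKR_PATCH)
end

puts euckr_to_utf8("\xA2\xE6")
puts euckr_to_utf8("\xA2\xE7")
```

Because the fallback is only consulted for undefined conversions, the helper should behave the same as a plain `encode` call once the table itself contains both characters.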
