Bug #10097

Updated by nobu (Nobuyoshi Nakada) almost 10 years ago

By chance I had a look at enc/iso_8859_1.c and found, on line 288:

 ~~~C 
    ENC_REPLICATE("Windows-1252", "ISO-8859-1") 
 ~~~ 
 This defines Windows-1252 as a replica of ISO-8859-1, but that does not work for case folding: 

 ~~~ruby 
 # http://en.wikipedia.org/wiki/Windows-1252 
 s1 = "\u0160".encode 'windows-1252' # 'Š' 
 r1 = Regexp.new("\u0161".encode('windows-1252'), Regexp::IGNORECASE) # /š/i 
 s1 =~ r1 
    # => nil 
 s2 = "\u0178".encode 'windows-1252' # 'Ÿ' 
 r2 = Regexp.new("\u00FF".encode('windows-1252'), Regexp::IGNORECASE) # /ÿ/i 
 s2 =~ r2 
    # => nil 
 s3 = "\u00C0".encode 'windows-1252' # 'À' 
 r3 = Regexp.new("\u00E0".encode('windows-1252'), Regexp::IGNORECASE) # /à/i 
 s3 =~ r3 
    # => 0 
 ~~~ 

 So case-insensitive matching works when both characters of a pair are in ISO-8859-1 (e.g. À/à), but fails when one of them (Ÿ, whose lowercase ÿ is in ISO-8859-1) or both (Š/š, Ž/ž, Œ/œ) exist only in Windows-1252.
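
 To check every pair rather than three hand-picked ones, something like the following sketch could be used. The helper names `windows_1252_case_pairs` and `failing_case_pairs` are made up for illustration. It assumes a Ruby with Unicode-aware `String#downcase` (2.4 or later), because the case pairs are derived in UTF-8 and then transcoded back to Windows-1252. Which pairs are reported depends on the Ruby build being tested.

```ruby
# Hypothetical helper: list [upper, lower] pairs where both characters
# are representable in Windows-1252. Case mapping is done in UTF-8,
# so this assumes Unicode-aware String#downcase (Ruby 2.4+).
def windows_1252_case_pairs
  pairs = []
  (0x80..0xFF).each do |byte|
    raw = [byte].pack('C').force_encoding('windows-1252')
    begin
      lower = raw.encode('UTF-8').downcase
      # Skip characters that have no distinct lowercase form.
      next if lower == raw.encode('UTF-8')
      pairs << [raw, lower.encode('windows-1252')]
    rescue EncodingError
      # Byte unassigned in Windows-1252, or lowercase form not representable.
      next
    end
  end
  pairs
end

# Hypothetical helper: keep the pairs for which /lower/i does NOT match upper.
def failing_case_pairs(pairs)
  pairs.reject do |upper, lower|
    upper =~ Regexp.new(Regexp.escape(lower), Regexp::IGNORECASE)
  end
end

if __FILE__ == $PROGRAM_NAME
  failing_case_pairs(windows_1252_case_pairs).each do |upper, lower|
    printf("0x%02X / 0x%02X: no case-insensitive match\n", upper.ord, lower.ord)
  end
end
```

 Run as a script, this prints one line per pair that still fails and prints nothing if all pairs fold correctly.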
