Feature #10002

String swapcase

Added by Andreas Runk about 1 year ago. Updated 11 months ago.

[ruby-core:63475]
Status:Open
Priority:Normal
Assignee:-

Description

Hi, Ruby version 2.1.2 has a problem with the String#swapcase method and German letters.
E.g. "ä".swapcase returns "ä" but should return "Ä".


Related issues

Related to CommonRuby - Feature #10085: Add non-ASCII case conversion to String#upcase/downcase/swapcase/capitalize Open 07/23/2014

History

#1 Updated by Nobuyoshi Nakada about 1 year ago

  • Tracker changed from Bug to Feature

#2 Updated by Yukihiro Matsumoto about 1 year ago

The current implementation of case conversion methods in String class only understands ASCII characters.
We'd like to enhance it when possible. But we have to know how each character should be converted.
For example, how should we convert "ß" (eszett)?

Matz.
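
The ASCII-only behavior described above can be reproduced on any Ruby version with a small sketch. The helper name `ascii_swapcase` is hypothetical and is not Ruby's actual implementation; it only assumes that the 52 ASCII letters are swapped and everything else passes through unchanged.

```ruby
# Hypothetical helper that mimics an ASCII-only swapcase.
# Only a-z and A-Z are swapped; every other character is left as-is.
def ascii_swapcase(str)
  str.tr('a-zA-Z', 'A-Za-z')
end

puts ascii_swapcase("Hello ä")   # the ASCII letters change case, "ä" does not
```

Under this assumption "ä" is simply outside the set of characters the method knows about, which matches the report in the description.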

#3 Updated by Yui NARUSE about 1 year ago

For now, ffi-icu or twitter-text-rb can be used as a workaround.

#4 Updated by Dāvis Mosāns about 1 year ago

The Unicode Standard has already figured this out, so it just has to be implemented. Look at Default Case Algorithms in section 3.13 and Case Mappings in section 5.18. The mappings can be found in SpecialCasing.txt (and UnicodeData.txt); CaseFolding.txt could also be useful.

According to those, "ß" (LATIN SMALL LETTER SHARP S) in uppercase would be "SS" (two LATIN CAPITAL LETTER S characters), and it is the user's responsibility to know that such mappings are generally not reversible.

The Character Properties, Case Mappings & Names FAQ is also useful reading.
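
To make the "ß" point concrete, here is a minimal table-driven sketch. The tables `SIMPLE_SWAP` and `SPECIAL_UPPER` and the method `sketch_swapcase` are hypothetical names with a few hard-coded German letters; a real implementation would presumably generate its tables from UnicodeData.txt and SpecialCasing.txt.

```ruby
# Hypothetical, deliberately tiny mapping tables for illustration only.
SIMPLE_SWAP = {
  "ä" => "Ä", "ö" => "Ö", "ü" => "Ü",
  "Ä" => "ä", "Ö" => "ö", "Ü" => "ü"
}.freeze

# Special casing: one character expands to two when uppercased.
SPECIAL_UPPER = { "ß" => "SS" }.freeze

def sketch_swapcase(str)
  str.each_char.map { |ch|
    # Special mappings first, then the simple table, then ASCII letters.
    SPECIAL_UPPER[ch] || SIMPLE_SWAP[ch] || ch.tr('a-zA-Z', 'A-Za-z')
  }.join
end

once  = sketch_swapcase("Straße")   # "sTRASSE"
twice = sketch_swapcase(once)       # "Strasse", not the original "Straße"
puts once, twice
```

Applying the sketch twice does not give back the input, which is the non-reversibility mentioned above.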

#5 Updated by Zachary Scott about 1 year ago

We should delegate to @emboss every time we need to convert ß...

#6 Updated by Shyouhei Urabe 12 months ago

We are talking about swapcase, not folding. The fact that, as you say, "generally they are not reversible" is exactly the difficulty we are facing here. Also, since you cited CaseFolding.txt, you should be aware of type T folding, which is impossible without locale information.

If you think you can implement it, please show us.
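
As a rough illustration of why type T folding needs locale information, the sketch below hard-codes the two status-T entries from CaseFolding.txt (the Turkic mappings for "I" and "İ"). The names `TURKIC_FOLD` and `sketch_casefold` and the `turkic:` keyword are hypothetical, and everything outside those two entries falls back to ASCII-only lowercasing.

```ruby
# The two status-T (Turkic) entries, hard-coded for illustration.
TURKIC_FOLD = { "I" => "ı", "İ" => "i" }.freeze

# Hypothetical fold: the caller must say whether Turkic rules apply,
# because the string alone does not carry that information.
def sketch_casefold(str, turkic: false)
  str.each_char.map { |ch|
    (turkic && TURKIC_FOLD[ch]) || ch.tr('A-Z', 'a-z')
  }.join
end

puts sketch_casefold("KIZ")                 # "kiz"
puts sketch_casefold("KIZ", turkic: true)   # "kız" (dotless i)
```

The same input folds to two different results, so a method without a locale argument has to pick one of them.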

#7 Updated by Martin Dürst 11 months ago

  • Related to Feature #10085: Add non-ASCII case conversion to String#upcase/downcase/swapcase/capitalize added
