Feature #2034

closed

Consider the ICU Library for Improving and Expanding Unicode Support

Added by runpaint (Run Paint Run Run) about 15 years ago. Updated about 7 years ago.

Status: Rejected
Target version:
[ruby-core:25306]

Description

=begin
Has consideration recently been given to employing the ICU library (http://site.icu-project.org/) in Ruby? The bindings are in C and the library is mature. My ignorance of the Ruby source notwithstanding, this would allow existing String methods, among others, to support non-ASCII characters incrementally.

For a trivial example, consider String#to_i. It currently understands only the ASCII characters that represent digits. ICU provides a u_charDigitValue(code_point) function, which returns the decimal digit value of the given Unicode code point. Were String#to_i to use this, it would work with non-ASCII numeral systems, thus removing at least one of the "as long as it's ASCII" caveats currently associated with String methods.
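To make the String#to_i case concrete, here is a minimal sketch in plain Ruby of what digit-value-aware parsing could look like. It does not call ICU: the hand-picked ZERO_CODE_POINTS table stands in for what u_charDigitValue would supply, and the names digit_value and unicode_to_i are hypothetical helpers invented for this illustration, not existing Ruby or ICU APIs. It handles base 10 and an optional sign only, and ignores other String#to_i behaviour such as underscores.

```ruby
# Hypothetical stand-in for ICU's digit lookup: the code point of the digit
# zero for a few decimal digit ranges. A real implementation would consult
# the Unicode Character Database (for example through u_charDigitValue)
# instead of this small hand-picked subset.
ZERO_CODE_POINTS = [
  0x0030, # ASCII digits
  0x0660, # Arabic-Indic digits
  0x06F0, # Extended Arabic-Indic digits
  0x0966, # Devanagari digits
  0xFF10  # Fullwidth digits
].freeze

# Returns the decimal value (0..9) of a code point, or nil if the code point
# is not in one of the ranges listed above.
def digit_value(code_point)
  zero = ZERO_CODE_POINTS.find { |z| code_point.between?(z, z + 9) }
  zero && code_point - zero
end

# Hypothetical helper: base-10 parsing in the spirit of String#to_i that
# accepts the digit ranges above. Parsing stops at the first non-digit.
def unicode_to_i(string)
  s = string.lstrip
  sign = 1
  if s.start_with?("-")
    sign = -1
    s = s[1..-1]
  elsif s.start_with?("+")
    s = s[1..-1]
  end

  value = 0
  s.each_char do |ch|
    d = digit_value(ch.ord)
    break if d.nil?
    value = value * 10 + d
  end
  sign * value
end

arabic_indic = "\u0661\u0662\u0663"   # Arabic-Indic digits one, two, three
puts arabic_indic.to_i                # 0: String#to_i sees no ASCII digits
puts unicode_to_i(arabic_indic)       # 123
```

A table like this would need maintaining with every Unicode release, which is part of the argument for delegating the lookup to a library such as ICU.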

More generally, if it's desirable for String methods to properly support Unicode, and if the principal barrier is the difficulty of the implementation, then might there be at least a partial solution in marrying Ruby with ICU?

If ICU is infeasible, I'd appreciate understanding why. There are multiple approaches to what I term the second phase of Unicode support in Ruby, and it will be easier to choose between them if I understand the constraints. :-) (Of course, if a direction has already been determined and work on it is underway, I will gladly bow out. ;-))
=end


Related issues: 2 (0 open, 2 closed)

Related to Ruby master - Feature #10084: Add Unicode String Normalization to String class (Closed, duerst (Martin Dürst))
Related to Ruby master - Feature #10085: Add non-ASCII case conversion to String#upcase/downcase/swapcase/capitalize (Closed, duerst (Martin Dürst))