Misc #20994
Updated by cfis (Charlie Savage) 7 days ago
As part of Rice (Ruby C++ bindings), I am experimenting with using Unicode characters to make class names more readable (see https://ruby-rice.github.io/4.x/stl/stl.html#automatically-generated-ruby-classes). The class names look like this:

```
Map≺string≺char≻٬vector≺complex≺double≻≻≻
```

Here `<` and `>` are replaced by the Unicode characters precedes (\u227A) and succeeds (\u227B). In Ruby this works fine:

```ruby
irb(main):001> class Map≺string≺char≻٬vector≺complex≺double≻≻≻
irb(main):002> end
=> nil
irb(main):003> Map≺string≺char≻٬vector≺complex≺double≻≻≻.new
=> #<Map≺string≺char≻٬vector≺complex≺double≻≻≻:0x0000021114674c98>
```

However, this fails when using the Ruby C API function `rb_define_class`. Passing a UTF-8 encoded `char*` fails because `rb_define_class` calls `rb_intern`, which calls `rb_intern2`, which forces the use of ASCII encoding (see https://github.com/ruby/ruby/blob/5fec9308320e8b377681ef19b0cd46d53f94e8ac/symbol.c#L818).

I thought I might be able to define the class using ASCII characters and then call `rb_define_const` to add a UTF-8 encoded name, but that has the same problem.

My question: how does one create classes whose names contain non-ASCII characters via the C API?
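
For reference, the behavior being asked for can be reproduced from Ruby without the `class` keyword. The following is only a Ruby-level sketch, not a C API answer: the variable names `name`, `klass`, and `ascii_tagged` are illustrative, and the US-ASCII step merely shows why the encoding attached to the name matters. It does not claim to mirror what `rb_intern2` does internally.

```ruby
# The class name from the post, as a UTF-8 string literal
# (Ruby source files are UTF-8 by default).
name = "Map≺string≺char≻٬vector≺complex≺double≻≻≻"

# Define the class dynamically. const_set accepts the UTF-8 name,
# and the anonymous class takes its name from the constant.
klass = Object.const_set(name, Class.new)

# The same bytes tagged as US-ASCII do not form a valid string in that
# encoding, which illustrates why the encoding passed along with the
# name is significant.
ascii_tagged = name.dup.force_encoding(Encoding::US_ASCII)

puts klass.name                   # the class reports the UTF-8 name
puts name.to_sym.encoding         # encoding carried by the symbol
puts ascii_tagged.valid_encoding? # whether the US-ASCII tagged copy is valid
```

The C API equivalent would presumably need a way to intern the name with an explicit UTF-8 encoding rather than the ASCII default, which is the open question here.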