Misc #20994

Updated by cfis (Charlie Savage) 7 days ago

As part of Rice (Ruby C++ bindings), I am experimenting with using Unicode characters to make class names more readable (see https://ruby-rice.github.io/4.x/stl/stl.html#automatically-generated-ruby-classes).

 I am experimenting with class names like this: 

 ``` 
 Map≺string≺char≻٬vector≺complex≺double≻≻≻
 ```  

 Here ≺ and ≻ stand in for < and >; they are actually the Unicode characters PRECEDES (\u227A) and SUCCEEDS (\u227B).

 In Ruby this works fine: 

 ``` ruby 
 irb(main):01> class Map≺string≺char≻٬vector≺complex≺double≻≻≻ 
 irb(main):02> end 
 => nil 
 irb(main):013> Map≺string≺char≻٬vector≺complex≺double≻≻≻.new 
 => #<Map≺string≺char≻٬vector≺complex≺double≻≻≻:0x0000021114674c98> 
 ``` 

 However, this fails when using the Ruby C API function `rb_define_class`. Passing a `char*` that is UTF-8 encoded fails because `rb_define_class` calls `rb_intern`, which calls `rb_intern2`, which forces the use of ASCII encoding (see https://github.com/ruby/ruby/blob/5fec9308320e8b377681ef19b0cd46d53f94e8ac/symbol.c#L818).
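 To illustrate the encoding difference from the Ruby side, here is a minimal sketch. It does not call the C API; relabelling the bytes as US-ASCII is only my assumption about what an ASCII-only interning path would see, and the variable names are just for illustration.

```ruby
# The class name from above, built with \u escapes so the string is UTF-8
# regardless of the source file's encoding.
name = "Map\u227Astring\u227Achar\u227B\u066Cvector\u227Acomplex\u227Adouble\u227B\u227B\u227B"

# The same bytes relabelled as US-ASCII (assumption: roughly what a name
# looks like when it is interned without its UTF-8 encoding).
ascii_view = name.dup.force_encoding(Encoding::US_ASCII)

puts name.valid_encoding?        # the UTF-8 string is well formed
puts ascii_view.valid_encoding?  # multi-byte sequences are not valid US-ASCII

# From pure Ruby, the UTF-8 name is accepted as a constant name.
klass = Object.const_set(name, Class.new)
puts klass.name.encoding
```

 The bytes are identical in both strings; only the encoding label differs, which is why I suspect the name has to reach the symbol table with its UTF-8 encoding attached.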

 I thought I might be able to define the class using ASCII characters and then call `rb_define_const` to add a UTF-8 encoded name, but that has the same problem.

 My question: how does one create class names that contain non-ASCII characters via the C API?
