Feature #2061: Named Unicode Character Escapes

Added by runpaint (Run Paint Run Run) about 16 years ago. Updated over 14 years ago.

Status: Closed
Target version: -

Description

=begin
I suggest the addition of a \N{name} escape where name is the name of a Unicode character. It would resolve to the corresponding codepoint. 'N' is chosen because it's used by both Perl and Python for the same purpose.

This promotes more readable code compared to \u{} escapes because \N{WHITE SMILING FACE} is self-documenting whereas \u263A isn't. It can even be useful when the source encoding is UTF-8 because the meaning of unfamiliar glyphs is often clearer when they are named.

They should:

  • Normalise the name by converting to uppercase and replacing underscores with spaces.
  • Force the string's encoding to UTF-8 in the same fashion as \u{}.
  • Optionally support Perl's aliases for names containing parentheses as detailed in http://perldoc.perl.org/charnames.html .
  • Work inside regexp literals.
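The behaviour listed above can be approximated at run time with a small sketch. This is not the proposed lexer-level escape: `UNICODE_NAMES`, `normalise_unicode_name`, `named_char` and `expand_named_escapes` are hypothetical names invented for illustration, and the three-entry table is a hand-written excerpt rather than data generated from the Unicode Character Database.

```ruby
# Hypothetical, hand-written excerpt of the Unicode name table. A real
# implementation would generate the full table from the Unicode data files.
UNICODE_NAMES = {
  "WHITE SMILING FACE"   => 0x263A,
  "SNOWMAN"              => 0x2603,
  "LATIN SMALL LETTER A" => 0x0061,
}.freeze

# Normalise a name as suggested in the list: convert to uppercase and
# replace underscores with spaces.
def normalise_unicode_name(name)
  name.upcase.tr("_", " ")
end

# Resolve a character name to a one-character string. Array#pack("U")
# produces a UTF-8 encoded string, mirroring how \u{} forces UTF-8.
def named_char(name)
  codepoint = UNICODE_NAMES.fetch(normalise_unicode_name(name)) do
    raise ArgumentError, "unknown Unicode character name: #{name}"
  end
  [codepoint].pack("U")
end

# Expand \N{name} sequences found in an ordinary string at run time.
# (An assumption of this sketch: the real feature would be handled by
# the parser, so it would also apply inside regexp literals.)
def expand_named_escapes(str)
  str.gsub(/\\N\{([^}]+)\}/) { named_char($1) }
end
```

For example, `expand_named_escapes('I am \N{white_smiling_face}')` would replace the escape with U+263A. The input uses single quotes so that the backslash reaches the helper unchanged.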

I'd hoped to write this patch myself, but was unable. I'm happy to update tool/enc-unicode.rb and RubySpec, if that would help.
=end

Updated by naruse (Yui NARUSE) about 16 years ago #1

  • Status changed from Open to Assigned
  • Assignee set to naruse (Yui NARUSE)

=begin
I agree with this.
If you can't change the regexp engine or other core parts, I can do it.

You may already know these, but for others, the related documents are:
http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Name
http://www.unicode.org/Public/5.1.0/ucd/NamesList.html
http://www.unicode.org/reports/tr18/#Name_Properties
=end

Updated by runpaint (Run Paint Run Run) about 16 years ago #2

=begin

> If you can't change regexp or other core things, I can do.

Thank you. I made a couple of attempts but made no progress. :-(
=end

Updated by naruse (Yui NARUSE) about 16 years ago #3

=begin

> Normalise the name by converting to uppercase and replacing underscores with spaces.

I have done this for properties in r24836.
=end

Updated by naruse (Yui NARUSE) about 16 years ago #4

  • Status changed from Assigned to Closed

=begin
Reopen this when you have done it.
=end

Updated by runpaint (Run Paint Run Run) about 16 years ago #5

=begin
I must have been unclear: I am not able to implement this feature. It requires changes across multiple source files and familiarity with the lexer. In addition, it is not clear to me what the ideal data structure is.
=end
