Feature #18563

closed

Add "graphemes" and "each_grapheme" aliases

Added by shan (Shannon Skipper) almost 3 years ago. Updated almost 3 years ago.

Status: Closed
Assignee: -
Target version: -
[ruby-core:107416]

Description

https://bugs.ruby-lang.org/issues/13780#note-10

grapheme sounds like an element in the grapheme cluster. How about each_grapheme_cluster?
If everyone gets used to the grapheme as an alias of grapheme cluster, we'd love to add an alias each_grapheme.

Matz.

Languages that have added grapheme cluster support seem to be almost exclusively opting for the shorter "graphemes" alias as a part that stands for the whole.

  • JavaScript/TypeScript grapheme-splitter library: splitGraphemes
  • PHP: grapheme_extract
  • Zig ziglyph library: GraphemeIterator
  • Golang uniseg library: NewGraphemes
  • Matlab: splitGraphemes
  • Python grapheme library: graphemes
  • Elixir: graphemes
  • Crystal uni_text_seg library: graphemes
  • Nim nim-graphemes library: graphemes
  • Rust unicode-segmentation library: graphemes

Now that some time has passed and the "graphemes" alias for "grapheme clusters" has been fairly widely adopted by languages and libraries, I'd like to go ahead and propose a graphemes alias for grapheme_clusters and an each_grapheme alias for each_grapheme_cluster.
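
For illustration, the proposed aliases could be approximated in user code roughly as follows. This is only a sketch: `graphemes` and `each_grapheme` are hypothetical names that Ruby does not define, while `grapheme_clusters` and `each_grapheme_cluster` are the existing String methods they would alias.

```ruby
# Hypothetical sketch: these aliases are NOT part of Ruby. They are defined
# here only to illustrate what the proposal asks for.
class String
  alias_method :graphemes, :grapheme_clusters
  alias_method :each_grapheme, :each_grapheme_cluster
end

# "café" written with a combining acute accent (U+0301) after the "e"
word = "cafe\u0301"

code_point_count = word.chars.length             # 5 code points
cluster_count    = word.grapheme_clusters.length # 4 grapheme clusters (existing API)
alias_count      = word.graphemes.length         # 4, via the hypothetical alias

# The iterator alias yields the same clusters as each_grapheme_cluster.
collected = []
word.each_grapheme { |g| collected << g }
```

Reopening String like this affects the whole program, so a refinement might be preferable in real code; the monkey patch is used here only to keep the sketch short.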


Related issues 1 (0 open, 1 closed)

Related to Ruby master - Feature #13780: String#each_grapheme - Closed - naruse (Yui NARUSE)

Updated by mame (Yusuke Endoh) almost 3 years ago

  • Description updated (diff)

(I have added a URL to matz's original statement to the description.)

Updated by znz (Kazuhiro NISHIYAMA) almost 3 years ago

  • Subject changed from Add "graphemes" and "each_grapheme aliases to Add "graphemes" and "each_grapheme" aliases

Updated by nobu (Nobuyoshi Nakada) almost 3 years ago

How about letters and each_letter?

Updated by matz (Yukihiro Matsumoto) almost 3 years ago

  • Status changed from Open to Closed

For the record, "grapheme" and "grapheme cluster" are different concepts. Calling grapheme clusters "graphemes" is kind of like calling "Wikipedia" "Wiki".
Until the Unicode Consortium defines a shorter name for them, or the convention of calling them "graphemes" becomes popular as common sense, we won't provide such aliases. So my opinion has not changed since then.

Short answer: "not yet".

Matz.

Updated by Dan0042 (Daniel DeLorme) almost 3 years ago

nobu (Nobuyoshi Nakada) wrote in #note-4:

How about letters and each_letter?

I like the general idea, but to me "letters" means \p{L}.
[retracted part]

Or how about characters and each_character?
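
To make the \p{L} point above concrete, here is a small sketch comparing a match on the Unicode Letter property with the existing `grapheme_clusters` method. The sample string is made up for illustration.

```ruby
# Illustrative input: "e" + combining acute accent, a digit, and an emoji.
text = "e\u{301}1\u{1F600}"

# \p{L} matches only code points in the Unicode "Letter" category, so the
# combining mark, the digit, and the emoji are all skipped.
letters = text.scan(/\p{L}/)

# grapheme_clusters splits the whole string into user-perceived characters,
# whether or not they are letters.
clusters = text.grapheme_clusters

puts letters.length  # 1
puts clusters.length # 3
```

So a method named `letters` that returned grapheme clusters would include digits, symbols, and emoji, which is where the naming concern comes from.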
