Bug #21450

closed

Inconsistent `upcase` between `String` and `Symbol`

Added by Stranger6667 (Dmitry Dygalo) about 1 month ago. Updated about 1 month ago.

Status: Closed
Assignee: -
Target version: -
[ruby-core:122582]

Description

The behavior of Symbol#upcase and String#upcase differs for the character i when the :turkic option is given.

I'd expect val.upcase(:turkic) to behave consistently in both cases:

'i'.upcase(:turkic)
# "İ"   with dot
:i.upcase(:turkic)
# :I    no dot

However, when a non-ASCII character is present, the case mapping on Symbol works the same way as on String:

:iФ.upcase(:turkic)
# :İФ   # with dot
'iФ'.upcase(:turkic)
# "İФ"  # with dot 

Actions #1

Updated by Stranger6667 (Dmitry Dygalo) about 1 month ago

  • Description updated (diff)
Actions #2

Updated by Stranger6667 (Dmitry Dygalo) about 1 month ago

  • Description updated (diff)

Updated by nobu (Nobuyoshi Nakada) about 1 month ago · Edited

  • Status changed from Open to Closed

The difference comes from the encodings of the two objects.

The string "i" is UTF-8 even though it contains only 7-bit ASCII characters, because the source encoding defaults to UTF-8.
On the other hand, :i, a symbol that contains only 7-bit ASCII characters, has the encoding US-ASCII.

"i".encoding #=> #<Encoding:UTF-8>
:i.encoding  #=> #<Encoding:US-ASCII>
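
The same reasoning seems to account for the second observation in the description: a symbol containing a non-ASCII character cannot be US-ASCII, so it keeps the source encoding and is case-mapped like a UTF-8 string. A minimal check, where the variable names are only illustrative:

```ruby
# Illustrative names; "\u0424" is the Cyrillic letter Ф used in the report.
ascii_sym = :i
mixed_sym = :"i\u0424"   # the same symbol as :iФ in a UTF-8 source file

p ascii_sym.encoding     # US-ASCII: only 7-bit ASCII characters
p mixed_sym.encoding     # UTF-8: contains a non-ASCII character
```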

A string in US-ASCII behaves the same way as the symbol.

"i".encode("us-ascii").upcase(:turkic) #=> "I"
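
Given the encoding explanation above, one way to get the Turkic mapping for an ASCII-only symbol is to go through a UTF-8 string first. The sketch below is an assumption-laden workaround, not something proposed in this thread; `turkic_upcase_symbol` is a hypothetical helper name.

```ruby
# Hypothetical helper, not part of Ruby's core API.
# Converts the symbol to a UTF-8 string so that the :turkic option
# applies, then converts the result back to a symbol.
def turkic_upcase_symbol(sym)
  sym.to_s.encode(Encoding::UTF_8).upcase(:turkic).to_sym
end

p turkic_upcase_symbol(:i)   # dotted capital I (U+0130) instead of :I
p turkic_upcase_symbol(:a)   # other ASCII letters are unaffected
```

Whether this is worth doing depends on whether symbols in the application are really meant to carry locale-sensitive text; the helper only sidesteps the US-ASCII encoding of ASCII-only symbols.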