Bug #1681
Integer#chr Should Infer Encoding of Given Codepoint (Closed)
Description
=begin
String#ord and Integer#chr are symmetrical operations on ASCII Strings:
 'a'.ord.chr   #=> "a"
But Integer#chr fails to round-trip when the given codepoint is outside the range of ASCII:
 "\u{2563}".ord.chr #=> RangeError: 9571 out of char range
To fix this, the codepoint's encoding needs to be specified:
 "\u{2563}".ord.chr('utf-8')  #=> "╣"
This seems needlessly verbose given that Ruby already knows that my source encoding is UTF-8. I suggest, then, that when invoked with no argument, Integer#chr interpret the given codepoint with respect to the current encoding, raising a RangeError only if the codepoint is out of range for this inferred encoding.
=end
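
For illustration, the explicit equivalent of the proposed default would look something like this (a sketch only, assuming the source file is saved as UTF-8; __ENCODING__ is the source encoding of the current file):
 "\u{2563}".ord.chr(__ENCODING__)  #=> "╣"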

Updated by nobu (Nobuyoshi Nakada) over 16 years ago

=begin
Hi,
At Wed, 24 Jun 2009 06:42:29 +0900,
Run Paint Run Run wrote in [ruby-core:23997]:
> This seems needlessly verbose given that Ruby already knows
> that my source encoding is UTF-8.

It's irrelevant to source encoding.  A possibility would be
Encoding.default_internal?
--
Nobu Nakada
=end
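
For concreteness, a sketch of what the suggested fallback to Encoding.default_internal would mean in practice (an assumption for illustration; default_internal is nil unless it has been set):
 Encoding.default_internal = Encoding::UTF_8
 "\u{2563}".ord.chr(Encoding.default_internal)  #=> "╣"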

Updated by runpaint (Run Paint Run Run) over 16 years ago

=begin
>> This seems needlessly verbose given that Ruby already knows
>> that my source encoding is UTF-8.
>
> It's irrelevant to source encoding.  A possibility would be
> Encoding.default_internal?

Indeed; my mistake. :-)
--
Run Paint Run Run
=end

Updated by matz (Yukihiro Matsumoto) over 16 years ago

=begin
Hi,
In message "Re: [ruby-core:24001] Re: [Bug #1681] Integer#chr Should Infer Encoding of Given Codepoint"
on Wed, 24 Jun 2009 09:54:06 +0900, Run Paint Run Run <runrun@runpaint.org> writes:
|
|>> This seems needlessly verbose given that Ruby already knows
|>> that my source encoding is UTF-8.
|>
|> It's irrelevant to source encoding.  A possibility would be
|> Encoding.default_internal?
|
|Indeed; my mistake. :-)
Source encoding may be different from default internal encoding.
Since codepoint number does not contain any encoding information,
there's information loss.  I am not sure it is OK to use possibly
wrong encoding information (default internal), even as a default.
I'd like to hear opinion from others.
						matz.
=end
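
To make the information-loss point concrete, the same codepoint number can name different characters in different encodings, so the number alone cannot say which one was meant (a hypothetical illustration, not from the thread):
 0xA9.chr('ISO-8859-1')  #=> "\xA9"  (COPYRIGHT SIGN, ©)
 0xA9.chr('ISO-8859-5')  #=> "\xA9"  (CYRILLIC CAPITAL LETTER LJE, Љ)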

Updated by duerst (Martin Dürst) over 16 years ago

=begin
We have String#encode (without any arguments), which transcodes to
default_internal (and in addition, doesn't raise an exception for
invalid byte sequences,..., which may be a security issue), so I don't
think using Integer#chr with a default encoding of default_internal
would be such a big problem.
Regards, Martin.
On 2009/06/25 18:06, Yukihiro Matsumoto wrote:
> Hi,
>
> In message "Re: [ruby-core:24001] Re: [Bug #1681] Integer#chr Should Infer Encoding of Given Codepoint"
> on Wed, 24 Jun 2009 09:54:06 +0900, Run Paint Run Run <runrun@runpaint.org> writes:
>
> |>> This seems needlessly verbose given that Ruby already knows
> |>> that my source encoding is UTF-8.
> |>
> |> It's irrelevant to source encoding.  A possibility would be
> |> Encoding.default_internal?
> |
> |Indeed; my mistake. :-)
>
> Source encoding may be different from default internal encoding.
> Since codepoint number does not contain any encoding information,
> there's information loss.  I am not sure it is OK to use possibly
> wrong encoding information (default internal), even as a default.
> I'd like to hear opinion from others.
>
> 						matz.
--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
=end
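
For reference, a short illustration of the String#encode behaviour mentioned above (assuming default_internal has been set; with no argument, encode transcodes to Encoding.default_internal):
 Encoding.default_internal = Encoding::UTF_8
 "\xA9".force_encoding('ISO-8859-1').encode  #=> "©"  (now UTF-8)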

Updated by matz (Yukihiro Matsumoto) over 16 years ago

- Status changed from Open to Closed
- % Done changed from 0 to 100

=begin
Applied in changeset r23865.
=end
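
Assuming the applied change implements the fallback discussed above (Integer#chr with no argument falling back to Encoding.default_internal when it is set), the original round trip would then behave as:
 9571.chr                                    #=> RangeError while default_internal is nil
 Encoding.default_internal = Encoding::UTF_8
 "\u{2563}".ord.chr                          #=> "╣"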