Backport #5287

closed

1.9.3 - Interpolation in a string causes the string's encoding to be set to ASCII-8BIT

Added by jonleighton (Jon Leighton) over 12 years ago. Updated over 12 years ago.

Status:
Closed
[ruby-core:39309]

Description

There appears to be a bug with the encoding of interpolated strings on 1.9.3.

Here is a comparison of versions:

1.9.2

ruby-1.9.2-p290 :001 > a = ""
=> ""
ruby-1.9.2-p290 :002 > a.encoding
=> #<Encoding:UTF-8>
ruby-1.9.2-p290 :003 > "#{a}".encoding
=> #<Encoding:UTF-8>

1.9.3-head

ruby-1.9.3-head :004 > a = ""
=> ""
ruby-1.9.3-head :005 > a.encoding
=> #<Encoding:UTF-8>
ruby-1.9.3-head :006 > "#{a}".encoding
=> #<Encoding:ASCII-8BIT>

ruby-head

ruby-head :003 > a = ""
=> ""
ruby-head :004 > a.encoding
=> #<Encoding:UTF-8>
ruby-head :005 > "#{a}".encoding
=> #<Encoding:UTF-8>
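
For anyone who wants to repeat the comparison across interpreters without typing it into irb, the sessions above can be condensed into a small script. This is only a sketch: the helper name interpolation_preserves_encoding? is made up for illustration and does not appear in the report. It makes no claim about which encoding a given build reports; it only checks whether "#{a}" keeps the encoding of a.

```ruby
# Hypothetical helper (not from the original report): true when
# interpolating a string on its own keeps that string's encoding.
def interpolation_preserves_encoding?(str)
  "#{str}".encoding == str.encoding
end

a = ""
interpolated = "#{a}"

# Print what this interpreter reports so runs can be compared side by side.
puts "ruby #{RUBY_VERSION}"
puts "a.encoding      = #{a.encoding}"
puts "\"\#{a}\".encoding = #{interpolated.encoding}"
puts "preserved?      = #{interpolation_preserves_encoding?(a)}"
```

Running the same file under each installed Ruby gives one line per build to compare; a build affected by this issue would be expected to print a different encoding on the third line than on the second.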


Related issues 1 (0 open, 1 closed)

Is duplicate of Ruby master - Bug #5126: Unicode character classes interpolated into regex throws exception (Closed, 08/01/2011)

Updated by jonleighton (Jon Leighton) over 12 years ago

To be clear about the version tested:

$ ruby -v
ruby 1.9.3dev (2011-09-05 revision 33190) [x86_64-linux]

Updated by nobu (Nobuyoshi Nakada) over 12 years ago

  • Tracker changed from Bug to Backport
  • Project changed from Ruby master to Backport193
  • Status changed from Open to Assigned
  • Assignee set to naruse (Yui NARUSE)
  • Priority changed from Normal to 5

Backport r32791.

Updated by naruse (Yui NARUSE) over 12 years ago

  • Status changed from Assigned to Closed

Backported in r33236.

Updated by aprescott (Adam Prescott) over 12 years ago

On Wed, Sep 7, 2011 at 12:20 AM, Adam Prescott wrote:

Since "#{a}" is actually a new string, doesn't it make sense that its
encoding should be the default internal encoding? I can see "#{a}" being
used with the encoding change actually expected.

I guess "no" is the answer?

What about "foo#{a}bar"? Would that have the same encoding result as
"#{a}", or is the latter just a special case? (Either choice seems
counterintuitive to me.)

Updated by naruse (Yui NARUSE) over 12 years ago

Adam Prescott wrote:

On Wed, Sep 7, 2011 at 12:20 AM, Adam Prescott wrote:

Since "#{a}" is actually a new string, doesn't it make sense that its
encoding should be the default internal encoding? I can see "#{a}" being
used with the encoding change actually expected.

I guess "no" is the answer?

default_internal has no effect in this situation.
"#{a}" is treated as s = a.to_s.
So "no" is the answer: the encoding of s depends on the encoding of a.

What about "foo#{a}bar"? Would that have the same encoding result as
"#{a}", or is the latter just a special case? (Either choice seems
counterintuitive to me.)

"foo#{a}bar" is treated as s = "foo"; s.concat(a.to_s); s.concat("bar").
So the encoding of the resulting s depends on "foo".
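
The two expansions described above can be written out by hand and compared against real interpolation. This is a sketch of the mental model only, not of the interpreter's actual implementation; the helper names expand_single and expand_with_literals are hypothetical. The script compares encodings between the two forms and does not assert any particular encoding, since the outcome of String#concat follows Ruby's encoding compatibility rules.

```ruby
# Hypothetical helper: "#{a}" modelled as a.to_s.
def expand_single(a)
  a.to_s
end

# Hypothetical helper: "foo#{a}bar" modelled as a literal followed by
# two concat calls. The literal is dup'ed so it can be mutated safely.
def expand_with_literals(a)
  s = "foo".dup
  s.concat(a.to_s)
  s.concat("bar")
  s
end

a = "\u00e9" # a UTF-8 string holding one non-ASCII character

single = "#{a}"
mixed  = "foo#{a}bar"

# Compare real interpolation with the hand-written expansion.
puts "single: #{single.encoding} vs #{expand_single(a).encoding}"
puts "mixed:  #{mixed.encoding} vs #{expand_with_literals(a).encoding}"
puts "same content? #{mixed == expand_with_literals(a)}"
```

Swapping in strings with other encodings for a is a quick way to see how far the model holds on a given Ruby build.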

Updated by aprescott (Adam Prescott) over 12 years ago

On Fri, Sep 9, 2011 at 3:07 PM, Yui NARUSE wrote:

 I guess "no" is the answer?

default_internal has no effect in this situation.
"#{a}" is treated as s = a.to_s.
So "no" is the answer: the encoding of s depends on the encoding of a.

 What about "foo#{a}bar"? Would that have the same encoding result as
 "#{a}", or is the latter just a special case? (Either choice seems
 counterintuitive to me.)

"foo#{a}bar" is treated as s = "foo"; s.concat(a.to_s); s.concat("bar").
So the encoding of the resulting s depends on "foo".

Helpful to know, thanks.
