Bug #15876: 1.to_s.encoding != Encoding.default_internal - Ruby - Ruby Issue Tracking System

Added by grosser (Michael Grosser) about 7 years ago. Updated almost 7 years ago.

Status: Closed
Assignee:
Target version:
ruby -v: 2.6.3
Backport: 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN
[ruby-core:92842]

Description

I ran into strange looking test output when I compared .to_s with an expected text, saying that the encoding was different, which is confusing/annoying especially to users that don't know how encodings work in ruby.
1.to_s.encoding should be the same as "".encoding

Updated by shevegen (Robert A. Heiler) about 7 years ago
#1 [ruby-core:92843]

which is confusing/annoying especially to users that don't know
how encodings work in ruby.

I personally finally switched to UTF-8 (oddly enough, primarily because of emoji and
Unicode symbols, which can be used for simple indications both on the command line and
on the web), but I think one problem (for me) in moving from Ruby 1.8.x to later Ruby
versions was that there was not much documentation available.

Judging from your comment, encoding may still pose a problem for some Ruby users (or
potentially new Ruby users).

Some time ago, I think, Jeremy Evans wrote a document about symbols, which was added
(my apologies if I misremember). If anyone feels like writing a document about
encoding in Ruby and how to deal with it ... :) (it could be wiki-style, or perhaps
a GitHub gist or some other place; I am in no way suggesting that only a single
person should do so, it could be a collaborative effort).

To the issue at hand, I just tested in irb:

1.to_s.encoding # => #<Encoding:US-ASCII> (should be the same as "".encoding)
"".encoding # => #<Encoding:UTF-8>

This is indeed a little surprising (to me). There may be valid reasons for this,
perhaps default external encoding, or something like this, but I can see why
people may be confused about it. Actually what surprises me is that .to_s on
the number leads to US-ASCII encoding by default.

Looking back to when I used an ISO encoding, I think the most surprising result I
encountered actually involved the regexp engine and the encodings used
there. I do not remember exactly how I found it, but I think I reported it back
then; I am still not entirely sure how it came about, but regexes may also be an area
where users get a little confused, so documentation may be of some help.

Updated by mame (Yusuke Endoh) about 7 years ago
#2 [ruby-core:92845]

@grosser (Michael Grosser), could you elaborate on your problem? I cannot reproduce the warning. What warning did you see? And how?

s1 = 1.to_s
p s1.encoding #=> #<Encoding:US-ASCII>

s2 = "1"
p s2.encoding #=> #<Encoding:UTF-8>

p s1 == s2 #=> true with no warning
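
As a small illustration of the snippet above, the following sketch shows when strings with different encodings still compare equal. It builds the UTF-8 string explicitly with encode so that the result does not depend on the encoding of the source file; the variable names are only illustrative and not taken from the report.

```ruby
# Integer#to_s returns a US-ASCII string.
ascii = 1.to_s
# Build the UTF-8 counterpart explicitly, so the example does not
# depend on the encoding of the source file.
utf8 = "1".encode(Encoding::UTF_8)

p ascii.encoding == utf8.encoding   # false: US-ASCII vs. UTF-8
p ascii == utf8                     # true: ASCII-only content, compatible encodings
p ascii.eql?(utf8)                  # true as well

# With non-ASCII content, both the bytes and the encoding matter.
e_utf8   = "\u00E9"                            # "é" as UTF-8 (2 bytes)
e_latin1 = e_utf8.encode(Encoding::ISO_8859_1) # "é" as ISO-8859-1 (1 byte)
p e_utf8 == e_latin1                           # false
```

So the differing encoding label only affects equality once non-ASCII content is involved.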

Updated by duerst (Martin Dürst) about 7 years ago
#3 [ruby-core:92846]

@mame (Yusuke Endoh):

What @grosser (Michael Grosser) is saying is that

p s1.encoding == s2.encoding #=> false

but he expects the result to be true. But you are right that what counts is the equality of the strings, not the encodings.

Updated by mame (Yusuke Endoh) about 7 years ago
#4 [ruby-core:92847]

@grosser (Michael Grosser) said

I ran into strange looking test output when I compared .to_s with an expected text, saying that the encoding was different

I thought that some string-comparison assertions (maybe attributed to an external testing framework?) emitted a spurious warning like "the encoding was different" or something.

Updated by naruse (Yui NARUSE) about 7 years ago
#5 [ruby-core:92913]

Status changed from Open to Feedback

What is the problem you are actually troubled with?

If it is just a testing problem, I feel the test should just use the correct assertions.
But if this is a frequent pitfall, I may reconsider.
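
To illustrate what "correct assertions" could look like, here is a minimal sketch in plain Ruby, without any test framework. The helper name same_text? is hypothetical; it is not part of Ruby or of any testing library, and the reporter's actual test setup is not known from this thread.

```ruby
# Hypothetical helper: compare string content only, ignoring the
# encoding label. String#== already does this for compatible strings.
def same_text?(actual, expected)
  actual == expected
end

actual   = 1.to_s                        # US-ASCII
expected = "1".encode(Encoding::UTF_8)   # UTF-8

p same_text?(actual, expected)           # true

# If a test really needs identical encodings, normalize the value
# first instead of comparing the encodings of the raw strings.
normalized = actual.encode(Encoding::UTF_8)
p normalized.encoding == expected.encoding  # true
p normalized == expected                    # true
```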

Updated by Hanmac (Hans Mackowiak) about 7 years ago
#6 [ruby-core:92914]

There is Encoding.compatible?, which might help to check whether two strings/symbols have a common encoding.

@naruse (Yui NARUSE) I don't know if you are the right contact person for this, but is there a way to see if two encoding objects are compatible or can that only be checked on the string?

Updated by Eregon (Benoit Daloze) about 7 years ago
#7 [ruby-core:92917]

Hanmac (Hans Mackowiak) wrote:

is there a way to see if two encoding objects are compatible or can that only be checked on the string?

Encoding.compatible? can take Encoding arguments too:

> Encoding.compatible?(Encoding::UTF_8, Encoding::US_ASCII) 
=> #<Encoding:UTF-8>
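
For reference, a short sketch of Encoding.compatible? with both strings and Encoding objects. It returns an Encoding when the arguments are compatible and nil otherwise. The variable names are illustrative, and for the two ASCII-only strings the sketch only checks that some Encoding comes back, not which one.

```ruby
ascii = 1.to_s                         # US-ASCII, ASCII-only content
utf8  = "1".encode(Encoding::UTF_8)    # UTF-8, ASCII-only content

# ASCII-only strings in ASCII-compatible encodings are compatible.
p !Encoding.compatible?(ascii, utf8).nil?   # true

# Encoding objects work too, as in the irb line above.
p Encoding.compatible?(Encoding::UTF_8, Encoding::US_ASCII)  # #<Encoding:UTF-8>

# Non-ASCII strings in different encodings are not compatible.
e_utf8   = "\u00E9"                            # "é" as UTF-8
e_latin1 = e_utf8.encode(Encoding::ISO_8859_1) # "é" as ISO-8859-1
p Encoding.compatible?(e_utf8, e_latin1)       # nil
```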

Updated by jeremyevans0 (Jeremy Evans) almost 7 years ago
#8

Status changed from Feedback to Closed
