Bug #7201

closed

Setting default_external affects STDIN encoding but default_internal does not

Added by brixen (Brian Shirai) over 11 years ago. Updated over 11 years ago.

Status: Rejected
Target version:
ruby -v: ruby 1.9.3p286 (2012-10-12 revision 37165) [x86_64-darwin10.8.0]
Backport:
[ruby-core:48132]

Description

Changing Encoding.default_external changes STDIN.external_encoding, but changing Encoding.default_internal does not change STDIN.internal_encoding.

STDOUT and STDERR internal/external encodings are not changed in either case and are always nil.

Is this a bug? See the following IRB transcript:

$ irb
1.9.3p286 :001 > Encoding.default_external
=> #<Encoding:UTF-8>
1.9.3p286 :002 > Encoding.default_internal
=> nil
1.9.3p286 :003 > STDIN.external_encoding
=> #<Encoding:UTF-8>
1.9.3p286 :004 > STDIN.internal_encoding
=> nil
1.9.3p286 :005 > Encoding.default_external = "euc-jp"
=> "euc-jp"
1.9.3p286 :006 > STDIN.external_encoding
=> #<Encoding:EUC-JP>
1.9.3p286 :007 > STDIN.internal_encoding
=> nil
1.9.3p286 :008 > Encoding.default_internal = "iso-8859-1"
=> "iso-8859-1"
1.9.3p286 :009 > STDIN.internal_encoding
=> nil

Thanks,
Brian

Updated by mame (Yusuke Endoh) over 11 years ago

  • Status changed from Open to Assigned
  • Assignee set to naruse (Yui NARUSE)
  • Target version set to 2.0.0

Naruse-san, could you handle this?

--
Yusuke Endoh

Updated by naruse (Yui NARUSE) over 11 years ago

  • Status changed from Assigned to Rejected

This is not a bug in 1.9.3 and 2.0.0, although I feel this behavior is not very good.
I want to change it, but that would be a big change, so I will keep compatibility for the near future.

Updated by brixen (Brian Shirai) over 11 years ago

Can someone please explain how the inconsistency with how the rest of IO instances would behave with transcoding is not a bug?

Thanks,
Brian

Updated by duerst (Martin Dürst) over 11 years ago

Hello Brian,

I'm not sure what the reason was for the current state, but I can easily
imagine a situation where stdin/stdout are the console and therefore in
one encoding, whereas the data a script is working on is all in another
encoding.

Regards, Martin.

Updated by naruse (Yui NARUSE) over 11 years ago

brixen (Brian Ford) wrote:

Can someone please explain how the inconsistency with how the rest of IO instances would behave with transcoding is not a bug?

This is because an IO object's internal properties are set when it is created.
In this case, STDIN's internal properties are not changed when default_external and default_internal are set.

In this situation, STDIN.external_encoding returns the current Encoding.default_external,
so it looks as if Encoding.default_external changes STDIN.
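
To illustrate the "set when it is created" point, here is a minimal sketch that opens the same file once before and once after Encoding.default_internal is changed. It uses ordinary File objects rather than STDIN, the helper name is hypothetical, and it assumes default_external is UTF-8 and default_internal starts out as nil (both are set explicitly and restored afterwards).

```ruby
require "tempfile"

# Hypothetical helper (not part of Ruby): returns the internal_encoding of a
# handle opened before the default_internal change and of one opened after it.
def internal_encodings_before_and_after(path)
  saved_ext = Encoding.default_external
  saved_int = Encoding.default_internal
  Encoding.default_external = "utf-8"   # assumption: fix the starting state
  Encoding.default_internal = nil

  before = File.open(path, "r")         # created while default_internal is nil
  Encoding.default_internal = "iso-8859-1"
  after = File.open(path, "r")          # created after the change

  [before.internal_encoding, after.internal_encoding]
ensure
  before&.close
  after&.close
  Encoding.default_external = saved_ext
  Encoding.default_internal = saved_int
end

Tempfile.create("encoding-demo") do |tmp|
  tmp.write("hello")
  tmp.flush
  p internal_encodings_before_and_after(tmp.path)
end
```

The handle that already existed keeps its nil internal encoding, which is the same situation STDIN is in, while the handle created afterwards picks up the new default.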

Details follow.

= IO's internal property

An IO object has two internal properties, extenc (external encoding) and intenc (internal encoding).

When extenc and intenc are explicitly given, as in open("foo.txt", "r:UTF-8:ISO-8859-1"),
extenc is UTF-8 and intenc is ISO-8859-1.

When extenc and intenc are not given, as in open("foo.txt", "r") or STDIN without -E/-U,
extenc is nil and intenc is nil.
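
The two cases can be observed from Ruby code roughly as follows. This is only a sketch: the helper name and the scratch file are hypothetical, and the result for plain "r" depends on the current Encoding.default_external and Encoding.default_internal, so no output is shown for it.

```ruby
require "tempfile"

# Hypothetical helper: reports what an IO opened with the given mode string
# says about its external and internal encodings.
def encodings_for(path, mode)
  File.open(path, mode) { |f| [f.external_encoding, f.internal_encoding] }
end

Tempfile.create("encoding-demo") do |tmp|
  tmp.write("hello")
  tmp.flush

  # Both encodings given explicitly in the mode string.
  p encodings_for(tmp.path, "r:UTF-8:ISO-8859-1")
  # => [#<Encoding:UTF-8>, #<Encoding:ISO-8859-1>]

  # Neither given: the reported values depend on the process-wide defaults.
  p encodings_for(tmp.path, "r")
end
```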

= IO#external_encoding

If extenc is not nil, returns extenc.
If extenc is nil, returns current Encoding.default_external.

This method exists so that you can find out which encoding is applied on io.read.
(It should always have returned extenc...)

= IO#internal_encoding

Returns intenc.
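
The rules for the two methods can be modelled in plain Ruby. The class below is a toy model written for this explanation, not Ruby's actual implementation; the name ModelIO and its instance variables are hypothetical.

```ruby
# Toy model of the lookup rules described above (not the real IO class).
class ModelIO
  def initialize(extenc = nil, intenc = nil)
    @extenc = extenc   # nil means "no explicit external encoding"
    @intenc = intenc   # nil means "no explicit internal encoding"
  end

  # Falls back to the *current* default when extenc is nil.
  def external_encoding
    @extenc || Encoding.default_external
  end

  # No fallback: simply reports the stored value.
  def internal_encoding
    @intenc
  end
end

saved = Encoding.default_external
begin
  stdin_like = ModelIO.new              # like STDIN without -E/-U
  Encoding.default_external = "euc-jp"
  p stdin_like.external_encoding        # => #<Encoding:EUC-JP>
  p stdin_like.internal_encoding        # => nil
ensure
  Encoding.default_external = saved
end
```

Because only external_encoding has a fallback, a later change to Encoding.default_external shows up in the model while a change to Encoding.default_internal would not, which matches the transcript in the report.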

= Conclusion

The current inconsistency derives from the IO object's internal state and the settings for conversion.
Changing it would require adding more internal properties and breaking IO#external_encoding.
I have not been able to design a better one yet.
