Bug #8342

IO.readlines ignores Encoding.default_internal if Encoding.default_external is ASCII-8BIT

Added by Leo Cassarani 12 months ago. Updated 11 months ago.

[ruby-core:54656]
Status:Closed
Priority:Normal
Assignee:Yui NARUSE
Category:M17N
Target version:2.1.0
ruby -v:1.9.3
Backport:1.9.3: UNKNOWN, 2.0.0: UNKNOWN

Description

Under normal circumstances, IO.readlines will transcode from Encoding.default_external to Encoding.default_internal:

File.open('hi', 'w') { |f| f.puts "hello\n" }
Encoding.default_external = Encoding::US_ASCII
Encoding.default_internal = Encoding::UTF_8
puts IO.readlines('hi').first.encoding
#=> UTF-8

However, when Encoding.default_external is set to ASCII-8BIT, IO.readlines will always use ASCII-8BIT, regardless of what Encoding.default_internal is set to:

File.open('hi', 'w') { |f| f.puts "hello\n" }
Encoding.default_external = Encoding::ASCII_8BIT
Encoding.default_internal = Encoding::UTF_8
puts IO.readlines('hi').first.encoding
#=> ASCII-8BIT

Using IO#gets instead of IO.readlines will produce the same behaviour.
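
The IO#gets case mentioned above can be sketched as follows. This is a minimal illustration rather than part of the original report: the temporary directory, the `path` variable and the save/restore of the process-wide defaults are assumptions added so the snippet can run without side effects. With a binary default external encoding, both reads would be expected to come back tagged as ASCII-8BIT.

```ruby
require 'tmpdir'

# Save the process-wide settings so the sketch can restore them afterwards.
saved_external = Encoding.default_external
saved_internal = Encoding.default_internal
saved_verbose  = $VERBOSE
$VERBOSE = nil # avoid the warning Ruby may print when the defaults are reassigned

gets_encoding = nil
readlines_encoding = nil

begin
  Encoding.default_external = Encoding::ASCII_8BIT
  Encoding.default_internal = Encoding::UTF_8

  Dir.mktmpdir do |dir|
    path = File.join(dir, 'hi') # hypothetical scratch file
    File.open(path, 'w') { |f| f.puts "hello" }

    # Instance-level read through IO#gets.
    File.open(path) { |f| gets_encoding = f.gets.encoding }

    # Class-level read through IO.readlines, as in the report.
    readlines_encoding = IO.readlines(path).first.encoding
  end
ensure
  Encoding.default_external = saved_external
  Encoding.default_internal = saved_internal
  $VERBOSE = saved_verbose
end

puts gets_encoding == Encoding::ASCII_8BIT
puts readlines_encoding == Encoding::ASCII_8BIT
```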

Associated revisions

Revision 40610
Added by Yui NARUSE 12 months ago

  • io.c (rb_io_ext_int_to_encs): ignore internal encoding if external encoding is ASCII-8BIT. [Bug #8342]

History

#1 Updated by Nobuyoshi Nakada 12 months ago

  • Category set to M17N
  • Status changed from Open to Assigned
  • Assignee set to Yui NARUSE
  • Target version set to 2.1.0

Seems intended behavior to me.

#2 Updated by Yui NARUSE 12 months ago

  • Status changed from Assigned to Rejected

If the external encoding is ASCII-8BIT, the input content is treated as binary.
It is excluded from text encoding conversion, and its encoding is kept as ASCII-8BIT even if default_internal is set.
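
To illustrate what "treated as binary" means in practice, here is a small sketch that is not taken from the thread. The sample bytes are made up for illustration. A string tagged ASCII-8BIT is a sequence of raw bytes, and a caller who knows the real encoding can relabel it with String#force_encoding, which changes the tag without touching the bytes.

```ruby
# Bytes as they might arrive from a binary read (hypothetical sample data:
# the UTF-8 byte sequence for "h\u00E9llo" followed by a newline).
raw = "h\xC3\xA9llo\n".b

# The string is tagged as binary; no conversion has been applied.
raw_encoding = raw.encoding

# If the caller knows the bytes are UTF-8, relabel a copy explicitly.
# force_encoding only changes the encoding tag, not the underlying bytes.
text = raw.dup.force_encoding(Encoding::UTF_8)

puts raw_encoding == Encoding::ASCII_8BIT
puts text.encoding == Encoding::UTF_8
puts text.valid_encoding?
```

Relabeling is a separate step from the transcoding that default_internal would normally trigger, which is why it has to be done by the caller.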

#3 Updated by Leo Cassarani 12 months ago

Thanks naruse. However, this seems inconsistent with the way encodings are handled for individual IO instances. For example:

io = File.open('hi', :encoding => "ascii-8bit:utf-16")
puts io.gets.encoding

=> UTF-16

This happens even if Encoding.default_external is set to ASCII-8BIT before opening the file.

#4 Updated by Yui NARUSE 12 months ago

  • Status changed from Rejected to Assigned

leocassarani (Leo Cassarani) wrote:

Thanks naruse. However, this seems inconsistent with the way encodings are handled for individual IO instances. For example:

io = File.open('hi', :encoding => "ascii-8bit:utf-16")
puts io.gets.encoding

=> UTF-16

This happens even if Encoding.default_external is set to ASCII-8BIT before opening the file.

That side sounds buggy

#5 Updated by Yui NARUSE 11 months ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r40610.
Leo, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • io.c (rb_io_ext_int_to_encs): ignore internal encoding if external encoding is ASCII-8BIT. [Bug #8342]
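
Going by the changeset message, a per-instance internal encoding would also be ignored when the external encoding is ASCII-8BIT. The sketch below is one way to check this on a given Ruby. It is an assumption-laden illustration, not part of the thread: the temporary file is hypothetical, and UTF-8 is used as the internal encoding instead of the UTF-16 from the earlier comment. On a Ruby that includes the change, the comparison would be expected to print true; on 1.9.3, as described above, the internal encoding would be applied instead.

```ruby
require 'tmpdir'

instance_encoding = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, 'hi') # hypothetical scratch file
  File.open(path, 'w') { |f| f.puts "hello" }

  # External encoding is binary; the requested internal encoding is UTF-8.
  File.open(path, :encoding => "ascii-8bit:utf-8") do |io|
    instance_encoding = io.gets.encoding
  end
end

puts instance_encoding == Encoding::ASCII_8BIT
```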
