Bug #18995

IO#set_encoding sometimes sets an IO's internal encoding to the default external encoding

Added by javanthropus (Jeremy Bopp) 3 months ago. Updated 3 months ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux]
[ruby-core:109842]

Description

This script demonstrates the behavior:

def show(io)
  printf(
    "external encoding: %-25p  internal encoding: %-25p\n",
    io.external_encoding,
    io.internal_encoding
  )
end

Encoding.default_external = 'iso-8859-1'
Encoding.default_internal = 'iso-8859-2'

File.open('/dev/null') do |f|
  f.set_encoding('utf-8', nil)
  show(f)                             # f.internal_encoding is iso-8859-2, as expected

  f.set_encoding('utf-8', 'invalid')
  show(f)                             # f.internal_encoding is now iso-8859-1!

  Encoding.default_external = 'iso-8859-3'
  Encoding.default_internal = 'iso-8859-4'
  show(f)                             # f.internal_encoding is now iso-8859-3!
end

In the first case, the IO's internal encoding is set to the current value of Encoding.default_internal. In the second case, it is set to Encoding.default_external instead. The third case is more interesting: it shows that the IO's internal encoding actually follows the current value of Encoding.default_external. The value was not simply copied when #set_encoding was called; it changes whenever Encoding.default_external changes.
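One way to tell "copied at call time" apart from "follows the global setting" is to reassign Encoding.default_external and compare what the IO reports before and after. The following is a minimal sketch under stated assumptions: the helper name internal_follows_default_external? and its probe parameter are hypothetical, not part of Ruby, and the probe encoding must differ from the current default external encoding for the result to mean anything.

```ruby
# Hypothetical helper (not part of Ruby): returns true when the IO's reported
# internal encoding changes to `probe` after Encoding.default_external is
# reassigned, i.e. the IO is tracking the global instead of holding a copy.
def internal_follows_default_external?(io, probe: Encoding::ISO_8859_5)
  saved = Encoding.default_external
  before = io.internal_encoding
  Encoding.default_external = probe   # change the global after set_encoding
  after = io.internal_encoding
  before != after && after == probe
ensure
  Encoding.default_external = saved   # always restore the global setting
end

File.open(File::NULL) do |f|
  f.set_encoding('utf-8', nil)
  puts internal_follows_default_external?(f)
end
```

The same probe can be run after the f.set_encoding('utf-8', 'invalid') call from the script above to check the second case. No expected output is given here because the result is exactly what this report is asking about.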

What should the correct behavior be?

Updated by javanthropus (Jeremy Bopp) 3 months ago

Can anyone confirm that this is a bug and not a misunderstanding? It looks like fixing this will require a fair bit of refactoring, and there do not yet appear to be any tests that check IO#internal_encoding and IO#external_encoding across the various argument combinations for IO#set_encoding. I found tests that open files and pipes with encoding arguments and check the resulting internal and external encodings of the IO object, but none of them cover these corner cases.
