Feature #3418
Closed
IO#putc Clobbers Multi-byte Characters
Description
=begin
IO#putc claims to write a "character", when in fact it writes only the first byte of its argument. I assume this is for backward compatibility reasons, but as this could lead to data loss, the documentation needs clarifying. Currently, #putc doesn't require the stream to be in binmode, doesn't warn about the truncation, and doesn't agree with IO#getc on the definition of "character".
open('/tmp/putc', 'w+') {|f| f.putc "\u1234"; f.rewind; f.read}
#=> "\xE1"
open('/tmp/getc', 'w+'){|f| f.print "\u1234"; f.rewind; f.getc}
#=> "ሴ"
If the IO stream explicitly specifies a non-BINARY encoding, the first example fails with an Encoding::UndefinedConversionError, which is reasonable.
open('/tmp/putc', 'w+:UTF-8'){|f| f.putc "\u1234"; f.rewind; f.read}
#=> Encoding::UndefinedConversionError: "\xE1" from ASCII-8BIT to UTF-8
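The byte-versus-character mismatch described above can be reproduced without calling #putc at all, which matters because #putc's behaviour may differ between Ruby versions. The following is a minimal sketch, not the interpreter's implementation: `written_bytes` is a hypothetical helper introduced only for this illustration, and it assumes a Ruby recent enough to provide Tempfile.create and String#byteslice.

```ruby
require 'tempfile'

# Hypothetical helper (not part of Ruby): write +str+ to a scratch file in
# binary mode and return the bytes that actually ended up on disk.
def written_bytes(str)
  Tempfile.create('putc-demo') do |f|
    f.binmode          # no transcoding in either direction
    f.write(str)
    f.flush
    File.binread(f.path).bytes
  end
end

ch = "\u1234"                      # one character, three bytes in UTF-8

# Writing the whole character, as IO#print does in the report.
whole = written_bytes(ch)

# Writing only the first byte, which is the truncation the report describes.
first_byte = ch.byteslice(0, 1)
truncated  = written_bytes(first_byte)

# A lone lead byte is not a valid UTF-8 character.
valid = first_byte.valid_encoding?

# Converting that byte from binary to UTF-8 fails, matching the error class
# shown for the 'w+:UTF-8' stream.
error_class =
  begin
    first_byte.dup.force_encoding(Encoding::ASCII_8BIT).encode(Encoding::UTF_8)
    nil
  rescue EncodingError => e
    e.class
  end
```

Until the documentation or behaviour changes, writing characters with IO#print or IO#write rather than IO#putc avoids the truncation, since both write every byte of the string.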
=end