



Feature #3418


IO#putc Clobbers Multi-byte Characters

Added by runpaint (Run Paint Run Run) over 14 years ago. Updated almost 14 years ago.



IO#putc claims to write a "character", when in fact it writes a byte. I assume this is for backward compatibility reasons, but as this could lead to data loss, the documentation needs clarifying. Currently, #putc doesn't require the stream to be in binmode, provide any warning of the truncation, or agree with IO#getc on the definition of "character".

open('/tmp/putc', 'w+') {|f| f.putc "\u1234"; f.rewind; f.read}
#=> "\xE1"

open('/tmp/getc', 'w+'){|f| f.print "\u1234"; f.rewind; f.getc}
#=> "ሴ"
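
The mismatch is easier to see at the byte level. The following is a minimal sketch rather than the code path io.c takes; it assumes a Ruby with String#byteslice (1.9.3 or later) and uses StringIO in place of the temporary files above.

```ruby
require 'stringio'

char = "\u1234"                    # one character, three bytes in UTF-8
first_byte = char.byteslice(0, 1)  # the single byte the report says putc wrote

p char.bytes                  # [225, 136, 180], i.e. E1 88 B4
p first_byte.valid_encoding?  # false: a lone lead byte is not valid UTF-8

# getc is character-oriented: it returns all three bytes as one character.
io = StringIO.new(char.dup)
p io.getc == char             # true
```

So a byte-oriented putc and a character-oriented getc cannot round-trip anything outside the single-byte range.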

If the IO stream explicitly specifies a non-BINARY encoding, the first example fails with an Encoding::UndefinedConversionError, which is reasonable.

open('/tmp/putc', 'w+:UTF-8'){|f| f.putc "\u1234"; f.rewind;}
#=> Encoding::UndefinedConversionError: "\xE1" from ASCII-8BIT to UTF-8
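
The same exception class can be reproduced without a file, which is consistent with the truncated byte being tagged ASCII-8BIT and then transcoded to the stream's encoding. This is a sketch under that assumption, not a trace of what IO#putc does internally.

```ruby
# Build "\xE1" as a binary (ASCII-8BIT) string, as the error message suggests.
lone_byte = [0xE1].pack('C')

error = begin
  lone_byte.encode(Encoding::UTF_8)
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

p lone_byte.encoding  # the binary encoding (ASCII-8BIT)
p error.class         # Encoding::UndefinedConversionError
```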


io.c-putc.patch (1.25 KB), runpaint (Run Paint Run Run), 06/10/2010 07:15 AM
io.c-putc.patch (1.1 KB), runpaint (Run Paint Run Run), 06/10/2010 07:18 AM
Actions #1

Updated by matz (Yukihiro Matsumoto) over 14 years ago


In message "Re: [ruby-core:30697] [Bug #3418] IO#putc Clobbers Multi-byte Characters"
on Thu, 10 Jun 2010 05:49:55 +0900, Run Paint Run Run writes:

|IO#putc claims to write a "character", when in fact it writes a byte. I assume this is for backward compatibility reasons, but as this could lead to data loss, the documentation needs clarifying.

Agreed. The behavior is intentional: the term "character" in the
documentation means a byte, as in 8-bit ASCII, so that IO#putc does not
diverge from the old putc(3) function in the C library. So this is a
documentation bug at most.



Actions #2

Updated by runpaint (Run Paint Run Run) over 14 years ago

Thanks. Patch attached.

Actions #3

Updated by runpaint (Run Paint Run Run) over 14 years ago

Drat. Wrong file; try this one.

Actions #4

Updated by matz (Yukihiro Matsumoto) over 14 years ago


In message "Re: [ruby-core:30701] [Bug #3418] IO#putc Clobbers Multi-byte Characters"
on Thu, 10 Jun 2010 07:18:58 +0900, Run Paint Run Run writes:

|File io.c-putc.patch added

Thank you for the patch. I will apply it, except for the examples
for multi-byte characters, since I want to leave that behavior as an
implementation detail.



Actions #5

Updated by matz (Yukihiro Matsumoto) over 14 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r28243.
Run Paint, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


Actions #6

Updated by mame (Yusuke Endoh) over 14 years ago

  • Status changed from Closed to Open
  • Assignee set to naruse (Yui NARUSE)


I agree that this is an implementation detail, but I also expect IO#putc
to handle ordinary characters, because IO#getc does:

$ cat t.txt

$ ruby19 -e 'open("t.txt") {|f| p f.getc }'

$ ruby19 -e 'open("t.txt", "w") {|f| f.putc ?あ }'

$ ruby19 -e 'open("t.txt") {|f| p f.getc }'

A separate IO#putbyte would be needed for byte-oriented output.
I am moving this ticket to a 1.9.x feature request.

Yusuke Endoh
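
For illustration, a byte-oriented writer along the lines suggested above could be sketched as follows. The name putbyte and its signature are hypothetical: IO#putbyte is only proposed in this thread and is not an existing IO method.

```ruby
require 'stringio'

# Hypothetical helper, not part of Ruby: writes exactly one byte, like putc(3).
# Accepts an Integer (low 8 bits are used) or anything with a string form
# (its first byte is used).
def putbyte(io, obj)
  byte = obj.is_a?(Integer) ? obj & 0xFF : obj.to_s.getbyte(0)
  raise ArgumentError, 'nothing to write' if byte.nil?
  io.write([byte].pack('C'))
  obj
end

out = StringIO.new(String.new)  # String.new yields an empty binary string
putbyte(out, "\u3042")          # U+3042 is E3 81 82 in UTF-8; only E3 is written
putbyte(out, 0x41)              # an Integer is written as a single byte
p out.string.bytes              # [227, 65]
```

Keeping the byte-level operation under its own name would leave putc free to mirror getc and write whole characters.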

Actions #7

Updated by shyouhei (Shyouhei Urabe) over 14 years ago

  • Status changed from Open to Assigned



Actions #8

Updated by naruse (Yui NARUSE) over 14 years ago

  • Status changed from Assigned to Closed

This issue was solved with changeset r29447.



