Bug #20869

IO buffer handling is inconsistent when seeking

Added by javanthropus (Jeremy Bopp) 8 days ago. Updated about 20 hours ago.

Status: Closed
Assignee: -
Target version: -
ruby -v: ruby 3.3.4 (2024-07-09 revision be1089c8ec) [x86_64-linux]
[ruby-core:119741]

Description

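When performing any of the seek-based operations on IO (IO#seek, IO#pos=, or IO#rewind), the read buffer is cleared inconsistently. Whether pushed-back data survives the seek depends on which read operation touched the buffer first: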

require 'tempfile'

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #ungetbyte as the first read buffer
  # operation uses a buffer that is preserved during
  # seek operations
  f.ungetbyte(97)
  # Byte buffer will not be cleared
  f.seek(2, :SET)

  f.getbyte       # => 97
end

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #getbyte before #ungetbyte uses a
  # buffer that is not preserved when seeking
  f.getbyte
  f.ungetbyte(97)
  # Byte buffer will be cleared
  f.seek(2, :SET)

  f.getbyte       # => 50
end
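
To compare the two cases against what the file actually holds, here is a minimal sketch that wraps both orderings in one helper. The helper name byte_after_seek and its keyword arguments are hypothetical, not part of Ruby. It relies on IO#pread, which reads at an explicit offset and bypasses the IO read buffer, as a ground truth for the byte stored at the target offset.

```ruby
require 'tempfile'

# Hypothetical helper (not part of Ruby): pushes back a byte, seeks to
# +offset+, and reports the next buffered byte next to the on-disk byte.
def byte_after_seek(prime_buffer:, pushed: 97, offset: 2)
  Tempfile.open do |f|
    f.write('0123456789')
    f.flush
    f.rewind

    # Optionally read once so the read buffer is filled before the push-back
    f.getbyte if prime_buffer
    f.ungetbyte(pushed)
    f.seek(offset, :SET)

    {
      next_byte: f.getbyte,              # what the buffered IO hands back
      on_disk:   f.pread(1, offset).ord  # what is stored at that offset
    }
  end
end

p byte_after_seek(prime_buffer: false)
p byte_after_seek(prime_buffer: true)
```

If the two printed hashes disagree in next_byte, the Ruby build in use shows the inconsistency described above. I have left out expected output for next_byte on purpose, since that value is exactly what varies.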

Similar behavior happens when reading characters:

require 'tempfile'

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #ungetc as the first read buffer
  # operation uses a buffer that is preserved during
  # seek operations
  f.ungetc('a')
  # Character buffer will not be cleared
  f.seek(2, :SET)

  f.getc       # => 'a'
end

Tempfile.open do |f|
  f.write('0123456789')
  f.rewind

  # Calling #getc before #ungetc uses a
  # buffer that is not preserved when seeking
  f.getc
  f.ungetc('a')
  # Character buffer will be cleared
  f.seek(2, :SET)

  f.getc       # => '2'
end

When transcoding, however, the character buffer is never cleared when seeking:

require 'tempfile'

Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind

  f.ungetc('a'.encode('utf-16le'))
  # Character buffer will not be cleared
  f.seek(2, :SET)

  f.getc       # => 'a'.encode('utf-16le')
end

Tempfile.open(encoding: 'utf-8:utf-16le') do |f|
  f.write('0123456789')
  f.rewind

  f.getc
  f.ungetc('a'.encode('utf-16le'))
  # Character buffer will not be cleared
  f.seek(2, :SET)

  f.getc       # => 'a'.encode('utf-16le')
end

I would expect the buffers to be cleared in all cases, except possibly when the seek operation doesn't actually move the file pointer, such as when calling IO#pos or IO#seek(0, :CUR). Regardless, the inconsistent behavior demonstrated here is a problem.
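
The expectation above can be turned into a small check that runs the character scenarios, with and without transcoding, and reports whether they all behave the same way. This is only a sketch under my own assumptions: pushback_discarded? and the scenario labels are hypothetical names, and the result depends on the Ruby build being tested.

```ruby
require 'tempfile'

# Hypothetical helper (not part of Ruby): true when the character read after
# seeking comes from the file ('2') rather than from the pushed-back 'a'.
def pushback_discarded?(prime_buffer:, **open_opts)
  Tempfile.open(**open_opts) do |f|
    f.write('0123456789')
    f.rewind

    f.getc if prime_buffer
    enc = f.internal_encoding
    pushed   = enc ? 'a'.encode(enc) : 'a'
    expected = enc ? '2'.encode(enc) : '2'

    f.ungetc(pushed)
    f.seek(2, :SET)

    f.getc == expected
  end
end

scenarios = {
  'plain, ungetc first'      => { prime_buffer: false },
  'plain, getc first'        => { prime_buffer: true },
  'transcoded, ungetc first' => { prime_buffer: false, encoding: 'utf-8:utf-16le' },
  'transcoded, getc first'   => { prime_buffer: true,  encoding: 'utf-8:utf-16le' }
}

results = scenarios.transform_values { |opts| pushback_discarded?(**opts) }
results.each { |name, discarded| puts format('%-26s discarded=%s', name, discarded) }
puts "consistent: #{results.values.uniq.size == 1}"
```

With the behavior I would expect, every scenario reports discarded=true. At minimum, the final line should report consistent: true.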
