Project

General

Profile

Actions

Bug #21686

open

In combination with IO#ungetbyte, the write position may become unpredictable.

Bug #21686: In combination with IO#ungetbyte, the write position may become unpredictable.

Added by YO4 (Yoshinao Muramatsu) about 18 hours ago. Updated about 5 hours ago.

Status:
Open
Assignee:
-
Target version:
-
ruby -v:
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x64-mingw-ucrt]
[ruby-core:123804]

Description

In the current implementation, using IO#ungetbyte can cause IO#pos to become negative.
Writing to the same file descriptor in this state will result in unexpected write positions.

require 'tempfile'

Tempfile.open do |f|
  f.write("0123456789")
  f.rewind
  f.getbyte        # rbuf filled
  f.ungetbyte("-") # f.pos => 0
  f.ungetbyte("-") # f.pos => -1
  f.write("x")     # seek to -1 failed in io_unread
  f.rewind
  p f.read         # => "0123456789x", wrote at the end, questionable
end

When internal io_unread() in io.c is called to prepare for a write operation while the actual file position is advancing due to reading,
lseek fails because it attempts to reset to a negative file position.

IO#seek has a similar issue.

Tempfile.open do |f|
  f.write("0123456789")
  f.rewind
  f.getbyte        # rbuf filled
  f.ungetbyte("-") # f.pos => 0
  f.ungetbyte("-") # f.pos => -1
  f.seek(0, File::SEEK_CUR)
  p f.pos          # => 10, this points to the last of rbuf.
end

I am preparing the PR, but I understand that there is discussion about the position where the write occurs and the possibility of the pos becoming negative.

Updated by YO4 (Yoshinao Muramatsu) about 18 hours ago Actions #1 [ruby-core:123805]

I made PR #15204

Changed and Unchanged

The behavior of this PR is listed below.
Since IO#ungetc has a different file position behavior when combined with encoding conversion, IO#getbyte is used for explanation.

  1. IO#ungetbyte moves the file position. This is to restore the file position as much as possible after executing IO#getbyte.

  2. After IO#ungetbyte is invoked, the file position may become negative.
    This is to avoid situations where the file position does not advance even after multiple getbyte calls, by restricting it to non-negative values.

  3. IO#pos always retains the buffer. Flushing the buffer when the file position is negative will not allow the next call to return the same file position.

  4. IO#seek may raise an exception if self.seek(0, File::SEEK_CUR)* is called. This occurs when the file position is negative.

Impact of Behavior Changes

Regarding IO#pos, since the typical use case for IO#ungetbyte is to push back the same data that was read, buffer retention or reset will yield identical results in this scenario.
The change to IO#seek is considered acceptable because, in the above typical use case, the file position never becomes negative, and the previous behavior is deemed of no practical value.

Updated by javanthropus (Jeremy Bopp) about 16 hours ago Actions #2 [ruby-core:123807]

#20919 may be related. If the PR for this issue is accepted, it likely determines how that issue should be addressed. There's already a PR that is several months old associated with it.

Updated by YO4 (Yoshinao Muramatsu) about 5 hours ago Actions #3 [ruby-core:123810]

As proposed in this issue, resolving this problem required determining a solution strategy across multiple methods with considering compatibility.

While I'm only an external contributor and not a core team member, I'll boldly say that the patch appears to merely mask the problem.
From my perspective, that PR seemed likely to cause issues by resetting pointers while leaving the encoding converter's state untouched. However, I hesitated to speak up because I wanted to find the actual conditions where it would cause problems.
As reviewers, they were required to be even more precise in their statements, and when solutions were unclear, they likely couldn't offer careless opinions.

We still need to explore another solution, like javanthropus's proposals in Bug #20889.
I hope my suggestion helps clarify the issues and move the case forward.

Actions

Also available in: PDF Atom