Bug #21765: stop using the C runtime _read() on Windows - Ruby - Ruby Issue Tracking System


Added by YO4 (Yoshinao Muramatsu) 3 months ago. Updated about 2 months ago.

Status: Assigned
Assignee: windows
Target version:
ruby -v:
Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN

[ruby-core:124029]

Description

On Windows, an IO instance is created in text mode by default.
When no encoding conversion is requested, however, the IO encoding conversion mechanism is bypassed and the CRLF conversion built into the C runtime's _read() is used instead.
This was done deliberately, for speed:
https://bugs.ruby-lang.org/issues/6401#note-4
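
To make the text-mode behavior concrete, here is a rough pure-Ruby model of what a text-mode read does to the bytes. `crt_text_translate` is a hypothetical helper written only for illustration; it is not part of Ruby or of the C runtime, and it ignores the chunk-by-chunk nature of the real _read().

```ruby
# Rough model of a text-mode read (an approximation, not the CRT's code):
# the first Ctrl-Z (0x1A) is treated as end of file, and CR LF pairs
# in front of it are collapsed to LF. A lone CR is left untouched.
CTRL_Z = "\x1A".b

def crt_text_translate(raw)
  bytes = raw.b
  eof_at = bytes.index(CTRL_Z)
  bytes = bytes[0, eof_at] if eof_at      # drop Ctrl-Z and everything after it
  bytes.gsub("\r\n".b, "\n".b)            # CR LF -> LF
end

raw = "a\r\nb\rc\x1Atail".b
p crt_text_translate(raw)   # => "a\nb\rc"
```

The result is shorter than the input, which is the root of the buffer and position problems described below.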

The price is that SET_BINARY_MODE(fptr) and SET_BINARY_MODE_WITH_SEEK_CUR(fptr) are called in many places in io.c, each one changing the mode of the file descriptor.
This makes the flow of operations hard to follow and changes hard to make, especially for developers who work on other platforms.

In addition, I found the issues I reported recently while checking the impact of moving the CRLF conversion to the encoding conversion mechanism:
#21691 On Windows some of binary read functions of IO are not functional
#21687 IO#pos goes wrong after EOF character(ctrl-z) met
#21634 Combining read(1) with eof? causes dropout of results unexpectedly on Windows.

These issues arise because data read into the rbuf does not match the stream due to newline conversion, or because the buffer end and file position do not align when CTRLZ is detected.
As a fix for Bug #21687, I created PR #15216. However, this relies on the internal behavior of the C runtime's _read() function, and it seems there is no way to avoid this dependency.
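
The mismatch can be illustrated with a small pure-Ruby calculation. This is only a sketch under simplifying assumptions: `translate_crlf`, `naive_pos` and `raw_offset_for` are hypothetical helpers, not functions from io.c, and the whole file is assumed to have been read into the buffer in one go.

```ruby
# Hypothetical model of a text-mode read: CR LF pairs collapse to LF.
def translate_crlf(raw)
  raw.b.gsub("\r\n".b, "\n".b)
end

# Position guessed as "descriptor offset minus unread buffered bytes".
# This is only valid when the buffer holds the stream bytes unchanged.
def naive_pos(fd_pos, rbuf, consumed)
  fd_pos - (rbuf.bytesize - consumed)
end

# Real offset in the raw stream after the caller has consumed
# `consumed` translated bytes.
def raw_offset_for(raw, consumed)
  offset = 0
  consumed.times do
    offset += 1 if raw.byteslice(offset, 2) == "\r\n".b  # skip the CR of a pair
    offset += 1
  end
  offset
end

raw  = "ab\r\ncd\r\nef".b        # 10 bytes in the file
rbuf = translate_crlf(raw)       # 8 bytes in the buffer
puts naive_pos(raw.bytesize, rbuf, 3)  # => 5
puts raw_offset_for(raw, 3)            # => 4
```

After the caller has read "ab\n", the real stream offset is 4, but arithmetic on the translated buffer yields 5, because the second CR LF pair that is still unread occupies one byte in the buffer and two in the file.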

I propose removing the use of C runtime _read().

Reason for Proposal

The mismatch between rbuf and stream contents complicates io_unread() and makes maintenance difficult.
Changing the O_BINARY/O_TEXT state of the file descriptor in various places hinders understanding of the behavior and makes modifications difficult.

Two methods to remove C runtime _read() while maintaining current behavior

1. Interpret CRLF and CTRLZ when reading rbuf within io.c.
2. Interpret CRLF and CTRLZ within the encoding conversion framework.

My initial idea was to implement the second method, using encoding conversion.
However, this internally moves the read operation from rbuf to cbuf, which changes the behavior of ungetc.
The proposal in Bug #21682 attempted to generalize this change to minimize its impact.
https://bugs.ruby-lang.org/issues/21682

This issue proposes the first method: CRLF conversion while reading from rbuf.

Problems caused by inconsistencies between the rbuf and stream contents are avoided, and io_unread() becomes the same as on other platforms.
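
As a rough illustration of the first method, the sketch below keeps the buffer byte-for-byte identical to the stream and interprets CR LF and Ctrl-Z only when bytes are handed to the caller. `RawBufferReader` and its methods are hypothetical names chosen for this example; the real change would live in io.c, and leaving the position at the Ctrl-Z byte is an assumption of the sketch, not something the proposal specifies.

```ruby
require "stringio"

# Hypothetical sketch: rbuf holds raw stream bytes, translation happens
# on the way out, so the position is always a plain raw byte offset.
class RawBufferReader
  CR = 0x0D
  LF = 0x0A
  CTRL_Z = 0x1A

  attr_reader :pos   # raw offset of the next unread byte

  def initialize(io, chunk_size = 4)
    @io = io
    @chunk_size = chunk_size
    @rbuf = "".b     # untranslated bytes
    @pos = 0
    @eof = false
  end

  # Return up to n translated bytes as a binary string.
  def read(n)
    out = []
    while out.size < n && !@eof
      fill if @rbuf.empty?
      break if @rbuf.empty?
      byte = @rbuf.getbyte(0)
      if byte == CTRL_Z
        @eof = true                    # assumption: stop in front of Ctrl-Z
      elsif byte == CR
        fill if @rbuf.bytesize < 2     # the pair may straddle a chunk boundary
        if @rbuf.getbyte(1) == LF
          consume(2)
          out << LF
        else
          consume(1)
          out << CR
        end
      else
        consume(1)
        out << byte
      end
    end
    out.pack("C*")
  end

  private

  def fill
    chunk = @io.read(@chunk_size)
    @rbuf << chunk if chunk
  end

  def consume(n)
    @rbuf.slice!(0, n)
    @pos += n
  end
end

reader = RawBufferReader.new(StringIO.new("ab\r\ncd\r\nef\x1Azz".b), 3)
p reader.read(3)   # => "ab\n"
p reader.pos       # => 4
```

Because the buffer is never rewritten, pushing bytes back or computing the position needs no CRLF-specific bookkeeping. The cost is that every read path has to perform the translation itself.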

Compared with implementing it as an encoding conversion, the advantage of this method is that behavior does not change.
Its drawback is that each read method in io.c requires individual handling, whereas the encoding conversion approach would keep the changes more localized.

Updated by YO4 (Yoshinao Muramatsu) 3 months ago
#1 [ruby-core:124030]

Updated by YO4 (Yoshinao Muramatsu) 3 months ago
#2 [ruby-core:124057]

Updated by hsbt (Hiroshi SHIBATA) about 2 months ago
#3 [ruby-core:124451]
