Windows bug with File.pos
On Windows since Ruby 1.9.3p125 there have been issues with File.pos and File.readline. Ruby 1.9.3p0 does not have this issue. I have created the following test:
t = make_tempfile
random = Random.new(1234)
open(t.path, "w") do |f|
i = 0
lines = open(t.path,'r').read.split("\n")
open(t.path, "r") do |f|
assert_equal lines[i], f.readline.chomp
i += 1
If you comment out the f.pos line this test case will pass. I originally submitted issue #6179 but the fixes applied there made things better but did not complete solve the problem. I apologize for the test case but it requires many lines with newlines to reproduce.
Updated by h.shirosaki (Hiroshi Shirosaki) almost 8 years ago
I confirmed the issue. Thanks for your test case.
If the end of reading buffer is CR, io_unread() needs to unread one more byte for CR.
I created a patch and a simplified test case for that.
Updated by luislavena (Luis Lavena) almost 8 years ago
jmthomas (Jason Thomas) wrote:
Is there some reason that the file reading got so messed up between 193p0 and now? Was there a refactor / rewrite of this fundamental operation?
Short answer: yes, there was a refactor of IO on Windows that lead a speed increase in both writing and reading big files. Seems there are corner cases that weren't covered by tests.
Long answer: both 1.9.2 and 1.9.3-p0 suffered from really slow IO reading and writing of files on Windows. This was caused primarily due newline conversion was performed always, even if was no required or the content already contained newlines.
The refactoring solved that issue and covered most of the cases exposed by tests boosting general IO operations on Windows.
However, there are cases like the one you exposed weren't covered by tests and thus, failed to get solved properly.
This refactor was introduced in 1.9.3 considering there will be another full year until Ruby 2.0 gets released. Since 1.9.2 Ruby has been getting slower and slower on Windows.
Instead of waiting to 2.0 to find and fix all those performance issues, we decided to start making a more usable Ruby today.
Hope that helps to understand the reasoning of these changes.
Updated by jonforums (Jon Forums) almost 8 years ago
As just one example, due to the Windows IO refactoring led primarily by Shirosaki-san, read performance improved from ~18.5s on 1.9.2/1.9.3p0 to ~1.4s on 1.9.3p125+ with one of my micro-benchmarks
Yes, you read that right...18.5s to 1.4s ;)
Updated by Anonymous almost 8 years ago
- Status changed from Assigned to Closed
- % Done changed from 0 to 100
This issue was solved with changeset r35594.
Jason, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
io.c (io_unread): fix IO#pos with mode 'r' bug on Windows.
If the end of reading buffer is CR, io_unread() needs to unread one
[ruby-core:44874] [Bug #6401]
add a test for above.