Bug #9153

IO#flush causes unnecessary fsync on Windows

Added by snaury (Alexey Borzenkov) over 10 years ago. Updated about 10 years ago.

Status:
Closed
Target version:
-
[ruby-core:58570]

Description

On Windows, calling IO#flush is effectively identical to calling IO#fsync: the contents of the file are committed to the disk platters instead of just being flushed from Ruby's buffers to the OS. I traced it back to bug #776, where the original "bug" was worked around by forcing an fsync on every flush. Unfortunately, this change makes IO#flush unusable, because fsync is very expensive: on one of my machines I saw fsync take up to 150ms, and I have heard stories of machines where fsync takes on the order of 2000ms.

I originally discovered this problem when a script of mine printed a couple hundred lines using Kernel#p; to my astonishment, once I redirected the output to a file, the script started taking several seconds to complete.
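A rough way to see the difference described above is to time a couple hundred small writes, once with IO#flush after each write and once with IO#fsync. This is only a sketch: `time_writes` is a hypothetical helper, not part of the attached patch, and the numbers depend entirely on the machine, the filesystem, and the Ruby build. On an affected Windows build the two timings would be expected to be close to each other; elsewhere flush should be much cheaper.

```ruby
require "tempfile"

# Hypothetical helper: write `lines` short lines to a temporary file,
# run the given block on the file after each write, and return the
# elapsed wall-clock time in seconds.
def time_writes(lines)
  elapsed = nil
  Tempfile.create("flush-bench") do |f|
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    lines.times do |i|
      f.puts "line #{i}"
      yield f
    end
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
  elapsed
end

# 200 lines roughly matches the "couple hundred lines" from the report.
flush_secs = time_writes(200) { |f| f.flush }  # hand Ruby's buffer to the OS
fsync_secs = time_writes(200) { |f| f.fsync }  # also ask the OS to commit to disk

printf("flush: %.4fs, fsync: %.4fs\n", flush_secs, fsync_secs)
```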

The problem with the original fix (adding fsync during flush) is that there was no issue to begin with. The file size not being updated isn't even due to Windows per se; it's due to how the NTFS driver is optimized to not update the file size (in the directory entry) until the file is closed. Please read this blog post for details about what's going on: http://blogs.msdn.com/b/oldnewthing/archive/2011/12/26/10251026.aspx

What I mean is that IO#flush without fsync properly flushes all the data to the file, and you can read all of that data from another process. The only thing that is not updated (until the file is closed) is the directory entry metadata, which is by design: it's how things are supposed to work on Windows with the NTFS filesystem. That the workaround (i.e. fsync) works is more of an accident: when the OS is forced to write all that data to disk, it currently tries to create a consistent picture and updates the directory metadata as well, but nothing says it will keep doing so in the future. Worst of all, the original bug was about temporary files, and fsync during IO#flush forces them to be written to disk even if they are short-lived.
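The claim that flushed data is readable without any fsync can be checked with a small script. The following is a minimal sketch, assuming a hypothetical helper `flush_then_read`; it uses a second file handle in the same process rather than a separate process, which is a simplification. It deliberately does not inspect the file size while the writer is still open, since that is exactly the value the report says may lag on NTFS.

```ruby
require "tmpdir"

# Hypothetical helper: writes `payload`, flushes, reads the file back
# through an independent handle while the writer is still open, then
# closes the writer and returns [data_seen_while_open, size_after_close].
def flush_then_read(payload)
  result = nil
  Dir.mktmpdir do |dir|
    path = File.join(dir, "flush-demo.bin")
    writer = File.open(path, "wb")
    begin
      writer.write(payload)
      writer.flush  # Ruby's buffer goes to the OS; this code never calls fsync

      # Independent read handle, opened before the writer is closed.
      seen = File.binread(path)
    ensure
      writer.close  # per the report, size metadata is only dependable after close
    end
    result = [seen, File.size(path)]
  end
  result
end

seen, size = flush_then_read("hello\n")
puts "read while writer open: #{seen.inspect}, size after close: #{size}"
```

Checking the size only after `close` follows the report's advice: close the file first if the directory metadata actually matters.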

Please remove fsync from IO#flush on Windows. You shouldn't work around correct Windows behavior and make it unbearably slow in the process. Instead, people need to learn how filesystems work on Windows and close files when they are finished writing to them and really need the directory metadata to be updated. (Most of the time, though, people shouldn't care about directory metadata like file size: it's just a cached value and is not necessarily accurate at any given moment.)


Files

no-fsync-on-flush.patch (799 Bytes), snaury (Alexey Borzenkov), 11/25/2013 11:17 PM