Bug #19378

open

Windows: Use less syscalls for faster require of big gems

Added by aidog (Andi Idogawa) over 1 year ago. Updated about 1 year ago.

Status:
Assigned
Assignee:
Target version:
-
[ruby-core:112045]

Description

Hello 🙂

Problem

require is slow on Windows for big gems (for example, require 'gtk3' takes 3+ seconds). This is a problem for people who want to make cross-platform GUI apps with Ruby.

Possible Reason

As touched on in #15797, it seems that require uses realpath, which is emulated on Windows. The emulation checks every parent directory, so the same syscalls run many times.

Testfile

C:\tmp\speedtest\testrequire.rb:

require __dir__ + "/helloworld1.rb"
require __dir__ + "/helloworld2.rb"

Run with:

ruby --disable-gems C:\tmp\speedtest\testrequire.rb

Syscalls per File/Directory:

  1. CreateFile
  2. QueryInformationVolume
  3. QueryIdInformation
  4. QueryAllInformationFile
  5. QueryNameInformationFile
  6. QueryNameInformationFile
  7. QueryNormalizedNameInformationFile
  8. CloseFile

Files/Directories checked

  1. C:\tmp
  2. C:\tmp\speedtest
  3. C:\tmp\speedtest\helloworld1.rb
  4. C:\tmp
  5. C:\tmp\speedtest
  6. C:\tmp\speedtest\helloworld2.rb

For two required files, Ruby had to make 8 * 6 = 48 syscalls.
The syscalls originate from rb_w32_reparse_symlink_p / lstat.

RubyGems files live in subfolders with 9+ path components, for example "C:\Ruby32-x64\lib\ruby\gems\3.2.0\gems\glib2-4.0.8\lib\glib2\variant.rb".
Each file takes at least 8 * 9 = 72 calls. For variant.rb it is 80 calls.
The results of the syscalls don't change in such a short time, so it should be possible to cache them.
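
The arithmetic above can be reproduced with a small sketch. It assumes 8 syscalls per checked entry, as listed under "Syscalls per File/Directory", and that every path component after the drive letter is checked once, which matches the Procmon observations in this report. The constant and method names are made up for illustration.

```ruby
# Assumption taken from the observation above: 8 syscalls per checked entry.
SYSCALLS_PER_ENTRY = 8

# Number of files/directories checked for one path: every component
# after the drive letter (the drive itself was not in the observed list).
def entries_checked(path)
  parts = path.split(/[\\\/]+/).reject(&:empty?)
  parts.shift if parts.first.to_s.match?(/\A[A-Za-z]:\z/)
  parts.length
end

# Estimated syscalls for requiring one file.
def estimated_syscalls(path)
  entries_checked(path) * SYSCALLS_PER_ENTRY
end

test_files = [
  'C:\tmp\speedtest\helloworld1.rb',
  'C:\tmp\speedtest\helloworld2.rb'
]
puts test_files.sum { |f| estimated_syscalls(f) } # => 48

gem_file = 'C:\Ruby32-x64\lib\ruby\gems\3.2.0\gems\glib2-4.0.8\lib\glib2\variant.rb'
puts estimated_syscalls(gem_file) # => 80
```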

With require_relative it's twice as many calls.

Other testcases

Same result:

File.realpath __dir__ + "/helloworld1.rb"
File.realpath __dir__ + "/helloworld2.rb"
File.stat __dir__ + "/helloworld1.rb"
File.stat __dir__ + "/helloworld2.rb"

It does not happen with $LOAD_PATH.resolve_feature_path(__dir__ + "/helloworld1.rb").

Request

Would it be possible to cache the stat calls when using require?
I tried to implement a cache inside the ruby source code, but failed.
If not, is there currently a way to combine Ruby files into one?
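
To illustrate the kind of cache meant here, this is a minimal Ruby-level sketch, not the attempted patch to the Ruby source. The class name PrefixStatCache and its methods are hypothetical. It assumes absolute paths, remembers the lstat result for each parent directory, and does no invalidation, so it is only safe if the directories do not change while it is in use.

```ruby
# Hypothetical sketch: remember the result of checking each path prefix so
# that shared parent directories are only probed once.
class PrefixStatCache
  attr_reader :misses

  # The probe defaults to File.lstat; a block can replace it (e.g. in tests).
  def initialize(&probe)
    @probe = probe || ->(path) { File.lstat(path) }
    @cache = {}
    @misses = 0
  end

  # "C:/tmp/speedtest/a.rb" -> ["C:/tmp", "C:/tmp/speedtest", "C:/tmp/speedtest/a.rb"]
  # Assumes an absolute path; the drive letter alone is not probed.
  def prefixes(path)
    parts = path.split(/[\\\/]+/).reject(&:empty?)
    drive = parts.first.to_s.match?(/\A[A-Za-z]:\z/) ? parts.shift : ""
    parts.each_index.map { |i| ([drive] + parts[0..i]).join("/") }
  end

  # Probes every prefix of the path, reusing cached results where available.
  # Returns the probe result for the full path.
  def check(path)
    prefixes(path).map { |prefix| lookup(prefix) }.last
  end

  private

  def lookup(prefix)
    @cache.fetch(prefix) do
      @misses += 1
      @cache[prefix] = @probe.call(prefix)
    end
  end
end
```

With the two test files from above, the shared directories C:\tmp and C:\tmp\speedtest would be probed once instead of twice, so 4 entries are probed instead of 6.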

I previously talked about require here: YJIT: Windows support lacking.

How to reproduce

Ruby versions: At least 3.0+, most likely older ones too.
Tested using Ruby Installer 3.1 and 3.2.
Procmon Software by Sysinternals
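
For a rough timing without Procmon, a script along these lines could be used. It is only a sketch: the method name time_requires and its depth and files parameters are made up here, and it measures wall-clock time for requiring empty files in a nested temporary directory rather than counting syscalls.

```ruby
require "benchmark"
require "fileutils"
require "tmpdir"

# Creates `files` empty Ruby files inside a directory nested `depth` levels
# deep and returns the wall-clock seconds needed to require all of them.
def time_requires(depth: 8, files: 20)
  Dir.mktmpdir("speedtest") do |root|
    dir = File.join(root, *Array.new(depth) { |i| "level#{i}" })
    FileUtils.mkdir_p(dir)

    paths = Array.new(files) do |i|
      path = File.join(dir, "helloworld#{i}.rb")
      File.write(path, "# empty test file\n")
      path
    end

    Benchmark.realtime { paths.each { |path| require path } }
  end
end

puts format("%.4f s", time_requires(depth: 8, files: 20))
```

Comparing the result for a small and a large depth on the same machine should show whether the cost grows with the number of parent directories.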


Files

windows-no-realpath-require.patch (992 Bytes): test to avoid repeated syscalls. aidog (Andi Idogawa), 01/30/2023 03:10 AM
windows-revert-79a4484a.patch (5.42 KB). joshc (Josh C), 02/24/2023 01:40 AM

Updated by aidog (Andi Idogawa) over 1 year ago

Thanks to the new Windows build docs by ioquatix, I made a test patch to check how much faster require would be if some of the repeated syscalls on the folders (c:/tmp/, c:/tmp/speedtest, gems and so on) were avoided:

tzinfo: 0.8s to 0.3s
gtk3: 2.8s to 2.5s (I see another similar issue inside the gem C code)

Windows has GetFinalPathNameByHandleW since Vista, which some other projects use for realpath. Would it work for Ruby?

Updated by nobu (Nobuyoshi Nakada) over 1 year ago

  • Status changed from Open to Assigned
  • Assignee set to windows

Updated by joshc (Josh C) over 1 year ago

I've also noticed a significant increase in file IO events (as reported by procmon) due to https://github.com/ruby/ruby/commit/79a4484a072e9769b603e7b4fbdb15b1d7eccb15, introduced in Ruby 3.1.0. The code tries to prevent the same file from being loaded twice by calling rb_realpath_internal to see if the realpath has already been loaded. This is a problem on systems like Windows that use Ruby's emulated realpath, especially when there are deeply nested directories. I've attached a revert patch. It'd be great to use GetFinalPathNameByHandleW and avoid the emulated code.
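
The idea of that check can be modelled in plain Ruby. This is a simplified illustration, not the C code from the commit; the class RealpathLoadTracker is hypothetical. It shows why every load needs one realpath call, which is the expensive step on Windows.

```ruby
require "set"

# Hypothetical model: a file counts as already loaded when its realpath
# has been seen before, even if it was requested under a different spelling.
class RealpathLoadTracker
  def initialize
    @loaded_realpaths = Set.new
  end

  # Returns true (and yields the realpath) the first time a file is seen,
  # false when the same underlying file was loaded before.
  def load_once(path)
    real = File.realpath(path) # one realpath call per load attempt
    return false unless @loaded_realpaths.add?(real)

    yield real if block_given?
    true
  end
end
```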

Updated by jeremyevans0 (Jeremy Evans) over 1 year ago

joshc (Josh C) wrote in #note-3:

I've attached a revert patch.

I think the only way we would revert 79a4484a072e9769b603e7b4fbdb15b1d7eccb15 is if someone can come up with an alternative approach to fixing Bug #17885.

It'd be great to use GetFinalPathNameByHandleW and avoid the emulated code.

If you mean to use this on Windows for the internals of File#realpath, I think we would be open to a backwards compatible patch for that, but @usa (Usaku NAKAMURA) would need to decide as he maintains the mswin64 platform.

Updated by MSP-Greg (Greg L) about 1 year ago

Just to be clear, this issue affects all Windows MRI platforms, so both mswin64 and mingw32 (mingw & ucrt builds) are affected.
