Bug #19378
open
Windows: Use less syscalls for faster require of big gems
Description
Hello 🙂
Problem
require is slow on Windows for big gems (for example, require 'gtk3' takes 3+ seconds). This is a problem for people who want to make cross-platform GUI apps with Ruby.
Possible Reason
As touched on in #15797, it seems that require uses realpath, which is emulated on Windows. The emulation checks every parent directory, so the same syscalls run many times.
Testfile
C:\tmp\speedtest\testrequire.rb:
require __dir__ + "/helloworld1.rb"
require __dir__ + "/helloworld2.rb"
ruby --disable-gems C:\tmp\speedtest\testrequire.rb
Syscalls per File/Directory:
- CreateFile
- QueryInformationVolume
- QueryIdInformation
- QueryAllInformationFile
- QueryNameInformationFile
- QueryNameInformationFile
- QueryNormalizedNameInformationFile
- CloseFile
Files/Directories checked
- C:\tmp
- C:\tmp\speedtest
- C:\tmp\speedtest\helloworld1.rb
- C:\tmp
- C:\tmp\speedtest
- C:\tmp\speedtest\helloworld2.rb
For two required files Ruby had to do 8*6 = 48 syscalls.
The syscalls originate from rb_w32_reparse_symlink_p / lstat.
Rubygems live in subfolders with 9+ parts: "C:\Ruby32-x64\lib\ruby\gems\3.2.0\gems\glib2-4.0.8\lib\glib2\variant.rb"
Each file takes 8 * 9 = 72+ calls. For variant.rb it is 80 calls.
The results of these syscalls don't change in such a short time, so it should be possible to cache them.
With require_relative it's twice as many calls.
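The arithmetic above can be reproduced with a small Ruby sketch. It assumes 8 syscalls per checked entry (the Procmon list above) and one check per path component below the drive root. The names checked_prefixes and estimated_syscalls are made up for this illustration and are not part of Ruby.

```ruby
# Rough estimate only: assumes 8 syscalls per checked entry and one
# check per path component below the drive root.
SYSCALLS_PER_ENTRY = 8

# Returns every prefix of a Windows-style path that would be checked.
def checked_prefixes(path)
  drive, *parts = path.split("\\")
  parts.each_index.map { |i| ([drive] + parts[0..i]).join("\\") }
end

# Estimated number of syscalls needed to resolve one path.
def estimated_syscalls(path)
  checked_prefixes(path).size * SYSCALLS_PER_ENTRY
end

puts estimated_syscalls('C:\tmp\speedtest\helloworld1.rb') # 24 (48 for both test files)
puts estimated_syscalls('C:\Ruby32-x64\lib\ruby\gems\3.2.0\gems\glib2-4.0.8\lib\glib2\variant.rb') # 80
```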
Other testcases
Same result:
File.realpath __dir__ + "/helloworld1.rb"
File.realpath __dir__ + "/helloworld2.rb"
File.stat __dir__ + "/helloworld1.rb"
File.stat __dir__ + "/helloworld2.rb"
It does not happen with $LOAD_PATH.resolve_feature_path(__dir__ + "/helloworld1.rb")
Request
Would it be possible to cache the stat calls when using require?
I tried to implement a cache inside the ruby source code, but failed.
If not, is there now a way to combine ruby files into one?
I previously talked about require here: YJIT: Windows support lacking.
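To make the caching idea concrete, here is a minimal Ruby sketch. It is only an illustration and not how Ruby's C implementation works: the class PrefixCheckCache and its block argument are hypothetical, the block stands in for the expensive lstat-style check, and cache invalidation is ignored entirely.

```ruby
# Hypothetical illustration of caching per-directory checks.
class PrefixCheckCache
  attr_reader :real_checks

  def initialize(&checker)
    @checker = checker   # stands in for the expensive per-entry check
    @cache = {}          # directory prefix => cached result
    @real_checks = 0     # how often the expensive check actually ran
  end

  # Checks every parent directory of path, reusing cached results,
  # then always checks the file itself.
  def check(path)
    parts = path.split("/")
    file = parts.pop
    prefix = parts.shift # drive part, e.g. "C:"
    parts.each do |part|
      prefix = "#{prefix}/#{part}"
      @cache.fetch(prefix) do
        @real_checks += 1
        @cache[prefix] = @checker.call(prefix)
      end
    end
    @real_checks += 1
    @checker.call("#{prefix}/#{file}")
  end
end

cache = PrefixCheckCache.new { |entry| File.symlink?(entry) }
cache.check("C:/tmp/speedtest/helloworld1.rb")
cache.check("C:/tmp/speedtest/helloworld2.rb")
puts cache.real_checks # 4 checks instead of 6
```

With the two test files, the parent directories are checked once and only the files themselves are checked again. A real implementation would also have to decide when cached results become stale.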
How to reproduce
Ruby versions: At least 3.0+, most likely older ones too.
Tested using Ruby Installer 3.1 and 3.2.
Procmon Software by Sysinternals
Updated by aidog (Andi Idogawa) 10 months ago
Thanks to the new Windows build docs by ioquatix, I made a test patch to check how much faster require would be if some of the repeated syscalls on the folders (c:/tmp/, c:/tmp/speedtest, gems and so on) were avoided:
tzinfo: 0.8s to 0.3s
gtk3: 2.8s to 2.5s (I see another similar issue inside the gem C code)
Windows has GetFinalPathNameByHandleW since Vista, which some other projects use for realpath. Would it work for Ruby?
Updated by nobu (Nobuyoshi Nakada) 10 months ago
- Status changed from Open to Assigned
- Assignee set to windows
Updated by joshc (Josh C) 9 months ago
I've also noticed a significant increase in file IO events (as reported by procmon) due to https://github.com/ruby/ruby/commit/79a4484a072e9769b603e7b4fbdb15b1d7eccb15 introduced in Ruby 3.1.0. The code tries to prevent the same file from being loaded twice by calling rb_realpath_internal to see if the realpath has already been loaded. This is a problem on systems like Windows that use Ruby's emulated realpath, especially when there are deeply nested directories. I've attached a revert patch. It'd be great to use GetFinalPathNameByHandleW and avoid the emulate code.
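For readers unfamiliar with the check described above, the following Ruby sketch models the idea of skipping a file whose realpath was already loaded. It is a simplified, hypothetical model (the class TinyLoader and its method are invented here); the actual logic is written in C and differs in detail.

```ruby
require "set"
require "tmpdir"

# Simplified, hypothetical model of realpath-based deduplication.
class TinyLoader
  def initialize
    @loaded_realpaths = Set.new
  end

  # Returns true if the file was loaded, false if its realpath was seen before.
  def require_once(path)
    real = File.realpath(path) # the step that is expensive on Windows
    return false unless @loaded_realpaths.add?(real)
    load real
    true
  end
end

Dir.mktmpdir do |dir|
  file = File.join(dir, "helloworld1.rb")
  File.write(file, "# empty feature\n")
  loader = TinyLoader.new
  p loader.require_once(file)                                  # true
  p loader.require_once(File.join(dir, ".", "helloworld1.rb")) # false, same realpath
end
```

Every call pays for File.realpath even when the file turns out to be loaded already, which is where the extra file IO events would come from.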
Updated by jeremyevans0 (Jeremy Evans) 9 months ago
joshc (Josh C) wrote in #note-3:
I've attached a revert patch.
I think the only way we would revert 79a4484a072e9769b603e7b4fbdb15b1d7eccb15 is if someone can come up with an alternative approach to fixing Bug #17885.
It'd be great to use GetFinalPathNameByHandleW and avoid the emulate code.
If you mean to use this on Windows for the internals of File#realpath, I think we would be open to a backwards compatible patch for that, but @usa (Usaku NAKAMURA) would need to decide as he maintains the mswin64 platform.
Updated by MSP-Greg (Greg L) 9 months ago
Code using GetFinalPathNameByHandleW already exists in win32/win32.c, see https://github.com/ruby/ruby/blob/c43fbe4ebd2b519601f0b90ca98fa096799d3846/win32/win32.c#L2013-L2022
For cross-reference, see also Bug #19246 'Rebuilding the loaded feature index much slower in Ruby 3.1'
Updated by MSP-Greg (Greg L) 9 months ago
Just to be clear, this issue affects all Windows MRI platforms, so both mswin64 and mingw32 (mingw & ucrt builds) are affected.