Feature #10869
Add support for option to pre-compile Ruby files
Description
I know this topic is tricky but please bear with me.
Goal: improve file-loading performance to speed up the boot process of applications that require lots of files/gems.
Background:
Currently most frameworks/gems rely on the autoload feature to allow applications to load faster by lazy loading files as missing constants are referenced.
Autoload behavior may lead to hard-to-understand bugs and I believe this is the main reason why Matz discourages its usage:
https://bugs.ruby-lang.org/issues/5653
I described a bug involving autoload in a real scenario in a comment on that same issue:
https://bugs.ruby-lang.org/issues/5653#note-26
While I agree that autoload should be discouraged, I think we should provide an alternative for speeding up application loading.
Overall benchmarks:
I decided to create a simple benchmark in order to measure how much time MRI would take to load 10_000 files containing a hundred methods each:
10000.times{|j| File.open("test#{j}.rb", 'w'){|f|f.puts "class A#{j}"; 100.times{|i| f.puts " def m#{i}; end"}; f.puts "end"}}
time ruby -r benchmark -I. -e 'puts Benchmark.realtime{10000.times{|i|require "test#{i}"}}'
8.766814350005006
real 0m10.068s
user 0m9.416s
sys 0m0.532s
time cat test*.rb > /dev/null
real 0m0.107s
user 0m0.068s
sys 0m0.040s
As you can see, most of the time is spent in MRI itself rather than on reading the files from disk. Using require_relative doesn't make any real difference either.
Suggested solution: Pre-compiled files
I know nothing about MRI internals, but I suspect MRI could support some sort of database containing a precompiled version of each file (the bytecode, maybe). The database would store the size and a hash of each processed file. If the size and hash are unchanged, MRI would assume the bytecode in the database is up to date, which should be the common case, and those files could then possibly be loaded much faster.
To avoid additional overhead or bugs in some cases, the pre-compile behavior could be opt-in through an option, which would also let us test this approach.
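To make the idea concrete, here is a minimal sketch of such a cache written in plain Ruby. It assumes an MRI that provides RubyVM::InstructionSequence#to_binary and RubyVM::InstructionSequence.load_from_binary. The PrecompiledLoader module, its cache directory and the key layout are hypothetical names made up for illustration, not an existing MRI feature.

```ruby
require "digest"
require "fileutils"

# Hypothetical helper: loads a Ruby file through an on-disk bytecode cache.
# The cache key combines the interpreter description, the absolute path,
# the file size and a SHA-256 digest of the source.
module PrecompiledLoader
  DEFAULT_CACHE_DIR = File.join(Dir.pwd, ".iseq_cache") # assumed location

  def self.cache_path(path, source, cache_dir)
    key = Digest::SHA256.hexdigest(
      [RUBY_DESCRIPTION, File.expand_path(path), source.bytesize,
       Digest::SHA256.hexdigest(source)].join(":")
    )
    File.join(cache_dir, "#{key}.bin")
  end

  def self.load(path, cache_dir: DEFAULT_CACHE_DIR)
    source = File.binread(path)
    cached = cache_path(path, source, cache_dir)

    iseq =
      if File.exist?(cached)
        # Size and hash match a stored entry: reuse the bytecode, skip parsing.
        RubyVM::InstructionSequence.load_from_binary(File.binread(cached))
      else
        # No matching entry: compile the file and store the bytecode.
        compiled = RubyVM::InstructionSequence.compile_file(path)
        FileUtils.mkdir_p(cache_dir)
        File.binwrite(cached, compiled.to_binary)
        compiled
      end

    iseq.eval # run the file's top-level code, as require would
  end
end
```

Calling PrecompiledLoader.load("test0.rb") twice would compile on the first call and read the stored bytecode on the second. The sketch still reads and hashes every source file, so whether it is actually faster than parsing would have to be measured.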
I understand that it may be complicated to precompile every kind of Ruby file, since a file can execute code rather than simply declare classes. Even so, I think it would be worth detecting such files and skipping them, pre-compiling only files that contain nothing but class declarations, which is already the case for a lot of files. This could eventually encourage gem authors to move their top-level statements into a separate file so that the classes can be precompiled in the future...
Do you have any other suggestions for speeding up application loading that do not involve autoload and conditional requires? Do you think precompilation is feasible and worthwhile in MRI?
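As a rough illustration of what detecting such files could look like, the sketch below uses Ripper from the standard library to check whether every top-level statement is a class, module or method definition. The declarations_only? helper is a hypothetical name, and the check is deliberately shallow: it does not look inside class bodies, which can also execute code.

```ruby
require "ripper"

# Top-level node types treated as "simple declarations" in this sketch.
DECLARATION_NODES = %i[class module def void_stmt].freeze

# Hypothetical helper: true when every top-level statement in the source
# is a class, module or method definition. Class bodies are NOT inspected.
def declarations_only?(source)
  sexp = Ripper.sexp(source)
  return false if sexp.nil? # Ripper.sexp returns nil on a syntax error

  # sexp has the form [:program, [statement, statement, ...]]
  sexp[1].all? { |node| DECLARATION_NODES.include?(node[0]) }
end
```

A file generated by the benchmark above would pass this check, while a file starting with a statement such as puts or require would not.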