Bug #14372
Closed
Memory leak in require with Pathnames in the $LOAD_PATH in 2.3/2.4
Description
There is a memory leak that we have found on ruby 2.3.6 and 2.4.3 that happens on Mac OSX and Linux. Ruby 2.2.6 and 2.5.0 do not leak. We have not tested other platforms.
If $LOAD_PATH contains one or more Pathname objects, require without a fully qualified path, such as require 'ostruct', will leak if you do this require many times.
For example, the following script will leak very quickly on ruby 2.3.6 and 2.4.3:
require 'pathname'
puts Process.pid
$LOAD_PATH.unshift(Pathname.new(__dir__))
dot = "."
filename = "ostruct"
1000.times { 1000.times { require filename }; print dot; GC.start; }
From what we can understand, it appears that rb_require_internal calls rb_feature_p, which ultimately calls rb_file_expand_path_fast and resizes a string. It doesn't seem like the memory is ever freed, either in C or by the garbage collector. This happens many times, perhaps because Pathname objects, unlike Strings, aren't cached in loaded features, so they get expanded each time. See below:
https://github.com/ruby/ruby/blob/v2_3_6/load.c#L43-L47
We can work around this problem by converting Pathname objects in $LOAD_PATH to Strings, but this leak should be fixed since it's common to use Pathname objects in $LOAD_PATH.
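A minimal sketch of that workaround, for illustration only. The helper name stringify_load_path and the example paths are hypothetical and do not come from Ruby or from this report; the sketch only relies on Pathname#to_s.

```ruby
require 'pathname'

# Hypothetical helper (not part of Ruby): returns a copy of the given
# load path with every non-String entry converted via #to_s.
def stringify_load_path(paths)
  paths.map { |entry| entry.is_a?(String) ? entry : entry.to_s }
end

# Example input mixing a String and a Pathname, as described above.
example = ["/usr/lib/ruby", Pathname.new("/srv/app/lib")]
cleaned = stringify_load_path(example)
p cleaned.map(&:class) # every entry is now a String

# To apply it to the real load path in place:
# $LOAD_PATH.replace(stringify_load_path($LOAD_PATH))
```

Running this once after all load path entries have been added should be enough, assuming nothing adds Pathname objects later.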
We used the instruments tool on OSX to show one such leak and the callstack, see attached image.
Updated by jrafanie (Joe Rafaniello) over 6 years ago
It's worth mentioning that more Pathname objects in the $LOAD_PATH may make this leak worse; even on ruby 2.5.0, the time for require increases with each Pathname added to the $LOAD_PATH.
Updated by jrafanie (Joe Rafaniello) over 6 years ago
I made a small change to see how the number of Pathnames in the $LOAD_PATH changes the leak amount at the script's completion.
It looks like the memory leak is linear (number of Pathnames, memory at completion):
1 74.6 MB
2 149.5 MB
3 214 MB
4 290 MB
5 353.6 MB
9 575.4 MB
10 650.6 MB
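As a rough check of the "linear" reading, the figures above can be put through a short calculation. This is only arithmetic on the numbers in the table; it does not subtract the baseline memory of the process, so treat the results as approximate. Between 1 and 10 Pathnames the growth works out to about 64 MB per additional Pathname.

```ruby
# Figures copied from the table above: number of Pathnames => MB at completion.
measurements = {
  1 => 74.6, 2 => 149.5, 3 => 214.0, 4 => 290.0,
  5 => 353.6, 9 => 575.4, 10 => 650.6
}

first_n, first_mb = measurements.first
last_n, last_mb = measurements.to_a.last

# Average growth per additional Pathname between the first and last rows.
slope_mb = (last_mb - first_mb) / (last_n - first_n)
puts format("about %.1f MB per additional Pathname", slope_mb)

# Memory divided by Pathname count for each row; values in a similar
# range are what a roughly linear relationship would look like.
measurements.each do |count, mb|
  puts format("%2d Pathname(s): %.1f MB each", count, mb / count)
end
```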
Here's the script:
require 'pathname'
puts Process.pid
puts ARGV[0]
(ARGV[0] || 1).to_i.times { $LOAD_PATH.unshift(Pathname.new(__dir__) ) }
dot = "."
filename = "ostruct"
1000.times { 1000.times { require filename }; print dot; GC.start; }
STDOUT.puts "exit?"
STDIN.gets
Additionally, I measured the time to do the requires, changing only the number of Pathnames in the $LOAD_PATH:
1 Pathname, 6.1s
2 Pathname, 8s
3 Pathname, 10.1s
4 Pathname, 12.2s
5 Pathname, 13.4s
9 Pathname, 20.3s
10 Pathname, 22.2s
require 'pathname'
puts Process.pid
puts ARGV[0]
(ARGV[0] || 1).to_i.times { $LOAD_PATH.unshift(Pathname.new(__dir__) ) }
dot = "."
filename = "ostruct"
1000.times { 1000.times { require filename }; print dot; GC.start; }
Updated by jrafanie (Joe Rafaniello) over 6 years ago
Because Rails.root is a Pathname, it's fairly common for developers to use Rails.root.join("lib") or something similar in their autoload_paths or eager_load_paths, both of which end up in the $LOAD_PATH and lead to a leak on each call to require.
This memory leak has been found in various open source projects and workarounds have been provided (convert Pathname objects destined for the $LOAD_PATH to strings):
https://github.com/lobsters/lobsters/pull/449
https://github.com/opf/openproject/pull/6148
https://github.com/errbit/errbit/pull/1257
https://github.com/ManageIQ/manageiq/pull/16837
https://github.com/ManageIQ/manageiq-api/pull/288
https://github.com/ManageIQ/manageiq-automation_engine/pull/146
https://github.com/ManageIQ/manageiq-ui-classic/pull/3266
https://github.com/ManageIQ/manageiq-graphql/pull/34
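The general shape of such a workaround can be sketched without Rails. In the snippet below, app_root and autoload_paths are stand-ins for Rails.root and config.autoload_paths, and the path is made up; this illustrates the idea and is not copied from any of the linked pull requests.

```ruby
require 'pathname'

# Stand-ins for illustration only: in a Rails app these would be
# Rails.root and config.autoload_paths; here they are plain objects.
app_root = Pathname.new("/srv/example_app") # hypothetical path
autoload_paths = []

# Leak-prone on Ruby 2.3/2.4: the entry stays a Pathname.
# autoload_paths << app_root.join("lib")

# Workaround: convert to a String before adding it.
autoload_paths << app_root.join("lib").to_s

p autoload_paths
```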
Updated by oliverguenther (Oliver Günther) over 6 years ago
jrafanie (Joe Rafaniello) wrote:
it's a fairly common for developers to use Rails.root.join("lib") or something similar in their autoload_paths or eager_load_paths, both of which end up in the $LOAD_PATH and lead to a leak on each call to require.
As linked by Joe above, exactly this happened to us at OpenProject and had a noticeable impact. I assume this will affect all larger Rails applications. As such, we endorse addressing this bug.
Best,
Oliver
Updated by wanabe (_ wanabe) over 6 years ago
Updated by nobu (Nobuyoshi Nakada) over 6 years ago
- Related to Bug #10222: require_relative and require should be compatible with each other when symlinks are used added
Updated by nobu (Nobuyoshi Nakada) over 6 years ago
- Status changed from Open to Closed
Updated by jrafanie (Joe Rafaniello) over 6 years ago
I opened a backport bug so we can have this memory leak fixed in ruby 2.3 and 2.4:
https://bugs.ruby-lang.org/issues/14424
Updated by nagachika (Tomoyuki Chikanaga) over 6 years ago
- Backport changed from 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN to 2.3: REQUIRED, 2.4: DONE, 2.5: DONTNEED
ruby_2_4 r62440 merged revision(s) 59983,59984.