Bug #6030

Thread-local "leak" in rb_exec_recursive*

Added by Charles Nutter about 2 years ago. Updated about 2 years ago.

[ruby-core:42665]
Status:Assigned
Priority:Low
Assignee:Nobuyoshi Nakada
Category:-
Target version:-
ruby -v:1.9.3 head Backport:

Description

I believe there may be a "leak" in the rbexecrecursive* functions in thread.c.

We have ported these methods for recursion detection in JRuby, and as in MRI they use a map inside a thread-local to track method name + object references. However, none of these method ever clean up the thread local when the recursive walk is complete.

The thread-local data is initialized in thread.c, recursivelistaccess, around line 3819 (in 1.9.3 branch):

if (NIL_P(hash) || TYPE(hash) != T_HASH) {
hash = rb_hash_new();
OBJ_UNTRUST(hash);
rb_thread_local_aset(rb_thread_current(), recursive_key, hash);

As far as I can tell, this thread-local is never cleared, holding a hash reference for as long as the thread is alive.

History

#1 Updated by Charles Nutter about 2 years ago

I forgot to point out a couple things...

  • By "leak" I mean this will hold references to data longer than it should. It's not a true leak in that it is bounded.

  • It is bounded, but only by method names; recursivelistaccess keys those thread-locals based on the current method name from the current frame, which means it will "leak" as many hash objects as there are unique methods utilizing rbexecrecursive*.

From recursivelistaccess:

...
VALUE sym = ID2SYM(rbframethisfunc());
...
list = rb
hashnew();
OBJ
UNTRUST(list);
rbhashaset(hash, sym, list);

In JRuby, I fixed this by having rbexecrecursiveouter clean up the thread-local upon completion, and I added recursivelistoperation (Ruby.java, recursiveListOperation) as a new "outermost" function to wrap our version of rbexecrecursive. our rbexecrecursive also gained an safeguard + error if it is called without being inside recursivelist_operation.

The commits are here:

https://github.com/jruby/jruby/commit/8f2ba4214eb070a83951aef1f143ebf6ebcecce4

https://github.com/jruby/jruby/commit/f7c1ffd93f34cef816168f36f5fe4f960a9ac2e2

It was a more severe problem for JRuby because the hash (a RubyHash) held references to the current JRuby instance, keeping it alive for as long as the thread was alive. Still a "leak", but a larger amount of bounded data held for too long.

#2 Updated by Marc-Andre Lafortune about 2 years ago

Hi.

I'm not sure I see where there is a problem.

If Array#==, say, is called in a given thread, it's quite likely that it will be called again later on, so why not keep the reference to the empty hash ready for next call?

As long as these hashes are empty at the end of the rbexecrecursive_*, it's not clear that there is much memory to recoup. It could also affect the performance.

#3 Updated by Charles Nutter about 2 years ago

It would be possible to bloat up memory use by calling many method names that all hit this logic; saving a hash per name and never releasing it seems unnecessary.

If it's not a big enough deal for MRI to fix, so be it...it was an issue in JRuby because of the additional rooting of JRuby instances, but JRuby support MVM.

For that matter, this also is highly unlikely to work with the MVM stuff, though I know that's still off in experimental-land anyway.

It just seems like a good idea to clean up after the recursive logic is done, rather than leaving a dirty hash lying around.

#4 Updated by Koichi Sasada about 2 years ago

  • Assignee set to Nobuyoshi Nakada
  • Priority changed from Normal to Low

#5 Updated by Shyouhei Urabe about 2 years ago

  • Status changed from Open to Assigned

Also available in: Atom PDF