Project

General

Profile

Actions

Bug #16278

closed

Potential memory leak when an hash is used as a key for another hash

Added by cristiangreco (Cristian Greco) over 2 years ago. Updated over 2 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-darwin18]
[ruby-core:95519]

Description

Hi,

I've been hitting what seems to be a memory leak.

When an hash is used as key for another hash, the former object will be retained even after multiple GC runs.

The following code snippet demonstrates how the hash {:a => 1} (which is never used outside the scope of create) is retained even after 10 GC runs (find will look for an object with a given object_id on heap).

# frozen_string_literal: true

def create
  h = {{:a => 1} => 2}
  h.keys.first.object_id
end

def find(object_id)
  ObjectSpace.each_object(Hash).any?{|h| h.object_id == object_id} ? 1 : 0
end


leaked = create

10.times do
  GC.start(full_mark: true, immediate_sweep: true)
end

exit find(leaked)

This code snippet is expected to exit with 0 while it exits with 1 in my tests. I've tested this on multiple recent ruby versions and OSs, either locally (OSX with homebrew) or in different CIs (e.g. here).

Can you please help understand what's going on here? Thanks!

Updated by mame (Yusuke Endoh) over 2 years ago

  • Status changed from Open to Rejected

Ruby uses so-called conservative GC, which is inexact. It is not guaranteed that an object is collected when it becomes unreachable.

You can confirm that it does not cause leak by the following code:

def create
  h = {{:a => 1} => 2}
  h.keys.first.object_id
end

loop { create }

This code uses constant memory. If it caused memory leak, the memory usage would continue to increase.

Updated by cristiangreco (Cristian Greco) over 2 years ago

  • Status changed from Rejected to Open

mame (Yusuke Endoh) wrote:

This code uses constant memory. If it caused memory leak, the memory usage would continue to increase.

Thank for your answer!

If an application exercises this pattern very frequently during lifetime and across multiple processes then it’s definitely going to bloat memory, at the very least. As a real-world example, this is causing high memory usage for the Prometheus client gem, where such pattern is heavily used when passing around metric labels.

As Ruby allows any object to be potentially used as hash key, this behaviour is easily going to cause troubles in long running application and to confuse developers.

I understand it might not be easy or cheap to change the way it works now, but hope you agree with me that application code should not be designed to work around these implementation-specific details. Do you think there’s any viable approach to alleviate this type of issues in future releases?

Updated by ko1 (Koichi Sasada) over 2 years ago

If an application exercises this pattern very frequently during lifetime and across multiple processes then it’s definitely going to bloat memory, at the very least. As a real-world example, this is causing high memory usage for the Prometheus client gem, where such pattern is heavily used when passing around metric labels.

Could you give us good example with that gem? As mame-san said, your example is not a problem because "This code uses constant memory".

Updated by cristiangreco (Cristian Greco) over 2 years ago

ko1 (Koichi Sasada) wrote:

If an application exercises this pattern very frequently during lifetime and across multiple processes then it’s definitely going to bloat memory, at the very least. As a real-world example, this is causing high memory usage for the Prometheus client gem, where such pattern is heavily used when passing around metric labels.

Could you give us good example with that gem? As mame-san said, your example is not a problem because "This code uses constant memory".

Hi Koichi, thanks for taking the time to look into this!

This example creates a metric of type counter, updates it 3 times using different labels and prints out an export of the metric registry:

# frozen_string_literal: true

require "prometheus/client"
require "prometheus/client/formats/text"

require "prometheus/client/data_stores/synchronized"
# require "prometheus/client/data_stores/single_threaded"
# require "prometheus/client/data_stores/direct_file_store"

Prometheus::Client.config.data_store = Prometheus::Client::DataStores::Synchronized.new
# Prometheus::Client.config.data_store = Prometheus::Client::DataStores::SingleThreaded.new
# Prometheus::Client.config.data_store = Prometheus::Client::DataStores::DirectFileStore.new(dir: ".")

def test
  registry = Prometheus::Client::Registry.new
  counter = registry.counter(
    :counter_metric,
    docstring: "a counter",
    labels: %i[label1 label2],
  )

  labels1 = { label1: "foo", label2: "bar" }
  labels2 = { label1: 1, label2: 2 }
  labels3 = { label1: :a, label2: :b }

  counter.increment(by: 1, labels: labels1)
  counter.increment(by: 1, labels: labels2)
  counter.increment(by: 1, labels: labels3)

  puts Prometheus::Client::Formats::Text.marshal(registry)

  [labels1.object_id, labels2.object_id, labels3.object_id]
end

def find(id)
  ObjectSpace.each_object(Hash).each { |h| id == h.object_id }
end

retained_ids = test

10.times do
  GC.start(full_mark: true, immediate_sweep: true)
end

retained_ids.each { |id| puts "found #{id}" if find(id) }

Using each store I found that the 3 labels hashes are retained after garbage collection (all the stores use a similar pattern of storing labels hashes within another hash).

What do you think is going on here?

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

cristiangreco (Cristian Greco) wrote:

Using each store I found that the 3 labels hashes are retained after garbage collection (all the stores use a similar pattern of storing labels hashes within another hash).

What do you think is going on here?

object_id is only unique for the life of the object. After the object is garbage collected, the same object_id could be used for a different object. So measuring using object_id is not a good idea. Maybe use ObjectSpace.define_finalizer in your testing? I'm not sure this affects your particular test program, it is just a good general principle.

Can you remove the usage of Prometheus in your test script, and design a test script using only core Ruby classes, so we can be sure that Prometheus is not holding any references to the hashes?

Updated by cristiangreco (Cristian Greco) over 2 years ago

jeremyevans0 (Jeremy Evans) wrote:

object_id is only unique for the life of the object. After the object is garbage collected, the same object_id could be used for a different object. So measuring using object_id is not a good idea. Maybe use ObjectSpace.define_finalizer in your testing? I'm not sure this affects your particular test program, it is just a good general principle.

Can you remove the usage of Prometheus in your test script, and design a test script using only core Ruby classes, so we can be sure that Prometheus is not holding any references to the hashes?

Hi Jeremy!

I reported an example code using the prometheus gem as requested by Koichi.

The usage of ObjectSpace.define_finalizer seems to always force objects to be retained in my tests. However, it seems very unlikely that object_ids are re-used in such short code.

I've put together another snippet that uses again object_id for comparison. In my tests h3 and[EDIT: only h4] h4 is always present on heap after GC:

# frozen_string_literal: true

require 'objspace'

class Klass; end

def create
  h1 = { :a => 1 }
  h2 = { Klass.new => 2 }
  h3 = { [Klass.new] => 3 }
  h4 = { { :a => Klass.new } => 4 }

  $id_h1 = h1.object_id
  $id_h2 = h2.object_id
  $id_h3 = h3.object_id
  $id_h4 = h4.object_id

  nil
end

10.times do
  GC.start(full_mark: true, immediate_sweep: true)
end

create

10.times do
  GC.start(full_mark: true, immediate_sweep: true)
end

ObjectSpace.each_object(Hash) do |h|
  puts "found h1" if h.object_id == $id_h1
  puts "found h2" if h.object_id == $id_h2
  puts "found h3" if h.object_id == $id_h3
  puts "found h4" if h.object_id == $id_h4
end

For the sake of completeness, this bug report originates from a strange behaviour I've observed using the MemoryProfiler gem: https://github.com/SamSaffron/memory_profiler/issues/81

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

Here's a modified version of your script, fixing the issue where $id_h4 = h3.object_id, and showing the actual contents of the objects found with those ids:

require 'objspace'

class Klass; end

def create
  h1 = { :a => 1 }
  h2 = { Klass.new => 2 }
  h3 = { [Klass.new] => 3 }
  h4 = { { :a => Klass.new } => 4 }

  $id_h1 = h1.object_id
  $id_h2 = h2.object_id
  $id_h3 = h3.object_id
  $id_h4 = h4.object_id

  nil
end

10.times do
  GC.start(full_mark: true, immediate_sweep: true)
end

create

10.times do
  GC.start(full_mark: true, immediate_sweep: true)
end

ObjectSpace.each_object(Hash) do |h|
  puts "found h1: #{ObjectSpace._id2ref($id_h1)}" if h.object_id == $id_h1
  puts "found h2: #{ObjectSpace._id2ref($id_h2)}" if h.object_id == $id_h2
  puts "found h3: #{ObjectSpace._id2ref($id_h3)}" if h.object_id == $id_h3
  puts "found h4: #{ObjectSpace._id2ref($id_h4)}" if h.object_id == $id_h4
end

Here's the output from running it a few times

# first run
found h4: {{:a=>#<Klass:0x00000615fa257108>}=>4}
found h1: {:a=>1}

# second run
found h1: {:a=>1}
found h4: {{:a=>#<Klass:0x0000032a2e2b2c20>}=>4}
found h3: {}
found h2: {:full_mark=>true, :immediate_sweep=>true}

# third run
found h4: {{:a=>#<Klass:0x0000067d0e3a2578>}=>4}
found h1: {:a=>1}

# fourth run (this when compiled with optimizations)
found h4: {{:a=>#<Klass:0x0000042b0e833c30>}=>4}

It does seem to always retain h4, but that could be due to the conservative GC. When compiled without optimizations (-O0), h1 seems to always be retained as well. On my system, h4 seems to always be retained in ruby 2.1 and 2.3-2.7, not always retained in 2.2 and the branch I have to fully separate keyword arguments from positional arguments (https://github.com/jeremyevans/ruby/tree/r3). As those changes seem unrelated, I'm going to assume those versions just change the memory layout such that the conservative GC no longer retains h4. The fact that h1 is always retained when compiled without optimizations lends weight to this theory.

I agree with @mame (Yusuke Endoh) that unless you are seeing unbounded memory growth, this is not a bug, just a result of the conservative GC.

Updated by nobu (Nobuyoshi Nakada) over 2 years ago

Adding this method and calling it after create clears the output.

def garbage
  h1 = h2 = h3 = h4 = h5 = h6 = h7 = h8 = h9 = h10 = nil
end

So a “shadow” seems staying on the VM stack.

Updated by cristiangreco (Cristian Greco) over 2 years ago

Hi Jeremy, thanks for these details!

I don’t know the details of ruby’s GC, seems to me it might behave unpredictably sometimes. I guess what confuses me now is that although that object is retained we don’t observe unbounded memory growth when calling the method in a loop: is it eventually collected? Is it somehow re-used, or what?

Another thing that intrigues me is that this behaviour is not always consistent: it might vary across different ruby builds (you mentioned that memory layout and compiler optimisations might affect the output) and is also subtly influenced by code changes. For example, if I add this line puts ObjectSpace.count_objects[:T_HASH] just after the call to create then h4 is almost never found after GC (or its reference is found associated to another object). Not sure why this happens though!

nobu (Nobuyoshi Nakada) wrote:

Adding this method and calling it after create clears the output.

def garbage
  h1 = h2 = h3 = h4 = h5 = h6 = h7 = h8 = h9 = h10 = nil
end

So a “shadow” seems staying on the VM stack.

Hey Nobu, what is going on here? What means that a shadow is retained on the stack?

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

  • Status changed from Open to Rejected

cristiangreco (Cristian Greco) wrote:

Hi Jeremy, thanks for these details!

I don’t know the details of ruby’s GC, seems to me it might behave unpredictably sometimes. I guess what confuses me now is that although that object is retained we don’t observe unbounded memory growth when calling the method in a loop: is it eventually collected? Is it somehow re-used, or what?

It is eventually collected when Ruby's GC can no longer find a reference to it. The reason it may be retained even though there is no direct reference to it in Ruby is that pointers to the object may still exist somewhere on the the C/Ruby stack. Ruby's GC is conservative, part of the scan for objects is just looking at at stack memory, and if any memory could be considered a pointer to a Ruby object, that Ruby object is retained during that GC pass.

This conservative GC is needed to make C extensions work. C local variables may hold pointers to Ruby objects, and collecting the Ruby objects while there is a C local variable pointing to them can cause undefined behavior (often a segfault).

Another thing that intrigues me is that this behaviour is not always consistent: it might vary across different ruby builds (you mentioned that memory layout and compiler optimisations might affect the output) and is also subtly influenced by code changes. For example, if I add this line puts ObjectSpace.count_objects[:T_HASH] just after the call to create then h4 is almost never found after GC (or its reference is found associated to another object). Not sure why this happens though!

It happens because the memory layout changed, and there are no longer locations in memory on the stack that point to the objects.

nobu (Nobuyoshi Nakada) wrote:

Adding this method and calling it after create clears the output.

def garbage
  h1 = h2 = h3 = h4 = h5 = h6 = h7 = h8 = h9 = h10 = nil
end

So a “shadow” seems staying on the VM stack.

Hey Nobu, what is going on here? What means that a shadow is retained on the stack?

It just means there are locations in memory on the stack that point to the Ruby objects. By calling another method that sets a bunch of local variables, the locations in memory that previously pointed to the hashes now are modified to contain nil (Qnil in C, 0x4 or 0x8 depending on processor type). Since no locations in memory point to the hashes, the hashes can then be GCed.

Hopefully that provides an adequate explanation for Ruby's GC. For more background, please review some conference presentations by @ko1 (Koichi Sasada) : http://www.atdot.net/~ko1/activities/. I'm going to close this now. Please only reopen if you can show a case with unbounded memory growth.

Updated by alanwu (Alan Wu) over 2 years ago

The GC scans the native stack for Ruby's C code to find values to retain.
Pointers to Ruby's heap objects can end up on the native stack for a variety
of reasons, and this is mostly up to the C compiler. Whether a pointer ends up
on the native stack depends on how Ruby is built, inlining decisions, register
alloation and a bunch of other things -- this is not something Ruby's code has
direct control over.

The generated machine code might interact in a way such that certain sequence
of Ruby code overwrite heap pointers on the native stack, letting the GC
collect those objects.

To illustrate, here is a modified version of your prometheus script:

# frozen_string_literal: true

require "prometheus/client"
require "prometheus/client/formats/text"

require "prometheus/client/data_stores/synchronized"

Prometheus::Client.config.data_store = Prometheus::Client::DataStores::Synchronized.new

def test
  registry = Prometheus::Client::Registry.new
  counter = registry.counter(
    :counter_metric,
    docstring: "a counter",
    labels: %i[label1 label2],
  )

  labels1 = { label1: "foo", label2: "bar" }
  labels2 = { label1: 1, label2: 2 }
  labels3 = { label1: :a, label2: :b }

  counter.increment(by: 1, labels: labels1)
  counter.increment(by: 1, labels: labels2)
  counter.increment(by: 1, labels: labels3)

  puts Prometheus::Client::Formats::Text.marshal(registry)

  ret = [labels1.object_id, labels2.object_id, labels3.object_id]
  return ret
  3.itself
  ret
end

def find(id)
  ObjectSpace.each_object(Hash).find { |h| id == h.object_id }
end

retained_ids = test

10.times do
  GC.start(full_mark: true, immediate_sweep: true)
end

retained_ids.each { |id|
  obj = find(id)
  puts "found #{id} #{obj}" if obj
}

Tracing with GDB, you can find that labels3 is retained because the compiler
chose to spill the receiver onto the native stack whenever a call without block
happens:

insns.def:
763     calling.block_handler = VM_BLOCK_HANDLER_NONE;
   0x00005555557284ee <+174>:   mov    %r15,%rbx
   0x00005555557284f1 <+177>:   movq   $0x0,0x70(%rsp)

764     vm_search_method(ci, cc, calling.recv = TOPN(calling.argc = ci->orig_argc));
   0x00005555557284fa <+186>:   movslq 0xc(%rcx),%rax
   0x00005555557284fe <+190>:   mov    %eax,0x80(%rsp)
   0x0000555555728505 <+197>:   shl    $0x3,%rax
   0x0000555555728509 <+201>:   sub    %rax,%rdx
   0x000055555572850c <+204>:   mov    -0x8(%rdx),%rax
   0x0000555555728510 <+208>:   mov    %rax,0x78(%rsp) <<<< Stores pointer to Ruby object on the native stack <<<<<<<

./include/ruby/ruby.h:
2058        if (RB_IMMEDIATE_P(obj)) {
=> 0x0000555555728515 <+213>:   test   $0x7,%al
   0x0000555555728517 <+215>:   je     0x55555572b24b <vm_exec_core+11787>

2059        if (RB_FIXNUM_P(obj)) return rb_cInteger;
   0x000055555572851d <+221>:   mov    0x1b56a4(%rip),%r12        # 0x5555558ddbc8 <rb_cInteger>
   0x0000555555728524 <+228>:   test   $0x1,%al
   0x0000555555728526 <+230>:   jne    0x555555728555 <vm_exec_core+277>

labels3 happens to be the last receiver in the method so it is left on the
stack and retained. We can allow the GC to collect labels3 by performing
another method call on some other object before the end of the test method.
Commenting out return ret, the VM spills 3 onto the same location that
labels3 occupied, allowing the GC to collect labels3.

Again, all this is up to the exact machine code output of your compiler, so
what I'm seeing here on my setup might be completely different from yours.
This kind of object retainment can happen anywhere in the Ruby's C code and
can easily be introduced or removed by changes to the VM or the Ruby code in
question.

This kind of extraneous retainment is an inherent property of Ruby's GC.
There is no memory growth since the fixed size native stack puts a cap on
the number of objects that could be retained in this manner.

This could be the reason to the memory bloat problems in your app, but I think
other explanations are more plausible. I would suggest taking another look at
your heap profile. There should be other, more actionable instances of
object retainment in your app.

Updated by cristiangreco (Cristian Greco) over 2 years ago

Thanks both @jeremyevans0 (Jeremy Evans) @alanwu (Alan Wu) for the rich and clear explanations!

Actions

Also available in: Atom PDF