Bug #11264
closedMemory leak in JSON stdlib ext (JSON generation)
Description
Hi,
I'm not sure if this is a bug, or just undocumented behaviour, but here's a script to reproduce the memory leak:
require 'json'
class MyClass
def to_json(*)
"a" * 1048576 # 1 megabytes of chars
end
end
class MyOther
def to_json(*)
raise "OMG"
end
end
1000.times do |i| # will leak up to ~ 4 gigs
puts i
JSON.dump([MyClass.new, MyClass.new, MyClass.new, MyOther.new]) rescue nil
end
What's happening is that the C extension is iterating over the array to eventually dump it out to JSON. It's going through the array in order, appending to the fbuffer
as needed. The problem is that that the API extension point of adding a to_json
method to a class (or object), without wrapping the code in some sort of 'begin...rescue , free(buffer), re-raise' block results in the buffer never being freed. Normally this isn't too bad, except if a lot of data was appended to the buffer before the error got raised.
To test it against normal behaviour in the above script, take out the offending MyOther.new
in the array. It should run much more smoothly this way :)
Note that since the fbuffer
s aren't GC marked (not that they should be), it isn't possible to trace this leak using GC.stat
.
Once again, not sure if this is a bug or if we should never raise errors from custom to_json
methods (ie: always wrap them in a begin... rescue block.
Thanks,
I also reported this to the JSON gem maintainer here: https://github.com/flori/json/issues/251
Updated by nobu (Nobuyoshi Nakada) almost 9 years ago
- Description updated (diff)
- Status changed from Open to Third Party's Issue
Luke Gruber wrote:
Once again, not sure if this is a bug or if we should never raise errors from custom to_json methods (ie: always wrap them in a begin... rescue block.
It doesn't work enough, as exceptions can raise even inside cState_prepare_buffer()
.
Updated by luke-gru (Luke Gruber) almost 9 years ago
Thanks for getting back to me.
That's true, but my guess is the most common leak scenario would be when calling a method that's basically an extension point to the library itself (such as to_json
. At least if you leak the internals of the buffer, it won't be too many bytes per leak :) Otherwise you would have to wrap the whole 'cState_partial_generate' method in a rescue block.
As the library stands now, lots of methods that raise will result in a leak, not just to_json
:
Object#respond_to?(:to_json)
Object#to_s
Array#[]
Hash#keys
Hash#[]
String#encode
But the chances of those methods raising errors are much lower, I think.
Should we continue this here or on https://github.com/flori/json/issues/251 ? I don't know what the right approach is
for std libs that are maintained separately as gems.
Updated by nobu (Nobuyoshi Nakada) almost 9 years ago
Luke Gruber wrote:
Should we continue this here or on https://github.com/flori/json/issues/251 ? I don't know what the right approach is
for std libs that are maintained separately as gems.
Yes, I've posted a patch there.