Bug #15968
closedCustom marshal_load methods allow object instance variables to "leak" into other objects
Description
While working on a Rails app, I noticed some odd behavior where after marshalling and demarshalling an array of ActiveRecord objects, some elements were replaced with symbols and empty hashes (original Rails bug report).
It appears some of Rails' custom marshallization methods modify allow an object's unset instance variables to be set during marshallization. However, since these instance variables weren't counted at the start of marshallization, they overflow into subsequent array elements upon demarshallization.
Here is a test case (written in plain Ruby) demonstrating this behavior:
require 'test/unit'
class Foo
attr_accessor :bar, :baz
def initialize
self.bar = Bar.new(self)
end
end
class Bar
attr_accessor :foo
def initialize(foo)
self.foo = foo
end
def marshal_dump
self.foo.baz = :problem
{foo: self.foo}
end
def marshal_load(data)
self.foo = data[:foo]
end
end
class BugTest < Test::Unit::TestCase
def test_marshalization
foo = Foo.new
array = [foo, nil]
marshalled_array = Marshal.dump(array)
demarshalled_array = Marshal.load(marshalled_array)
assert_nil demarshalled_array[1]
end
end
I'm not positive this qualifies as a bug - if a programmer writes custom marshal_dump
and marshal_load
methods, perhaps it's their responsibility to avoid unintended side-effects like those demonstrated in my test case.
However, I think this issue might be altogether avoided by adding a reserved delimiter character to Ruby's core marshallization functionality (in marshal.c) representing the "end" of a serialized object. For instance, in the above test case, marshalled_array
comes out to:
\x04\b[\ao:\bFoo\x06:\t@barU:\bBar{\x06:\bfoo@\x06:\t@baz:\fproblem0
Suppose Ruby used a z
character to represent the end of a serialized object - in this case, marshalled_array
would come out to something like:
\x04\b[\ao:\bFoo\x06:\t@barU:\bBar{\x06:\bfoo@\x06:\t@baz:\fproblemz0
(Note the second-to-last character - z
.)
This way, when demarshalling an object, even if additional instance variables had somehow snuck in during marshallization process, the z
character could be used to mark the end of a serialized object, ensuring that the extra instance variables don't overflow into the next segment of serialized data.
I don't write much C, and I haven't fully grokked Ruby's marshal.c - so there may be dozens of reasons why this won't work. But I think a serialization strategy along those lines may help avoid unexpected behavior.
Updated by nobu (Nobuyoshi Nakada) over 5 years ago
- Backport changed from 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN to 2.4: REQUIRED, 2.5: REQUIRED, 2.6: REQUIRED
Updated by nobu (Nobuyoshi Nakada) over 5 years ago
- Status changed from Open to Closed
Applied in changeset git|c9423b016cfeab852bc5a829e55e0a11f80b3ab7.
marshal.c: check instance variable count
- marshal.c (w_obj_each): ensure that no instance variable was
added while dumping other instance variables. [Bug #15968]
Updated by nobu (Nobuyoshi Nakada) over 5 years ago
- Description updated (diff)
The number of instance variables is placed just after the class name.
\x04\b[\ao:\bFoo\x06:\t@barU:\bBar{\x06:\bfoo@\x06:\t@baz:\fproblem0
^^^^
Adding instance variables after it is disallowed.
Updated by nagachika (Tomoyuki Chikanaga) almost 5 years ago
- Backport changed from 2.4: REQUIRED, 2.5: REQUIRED, 2.6: REQUIRED to 2.4: REQUIRED, 2.5: REQUIRED, 2.6: DONE
ruby_2_6 r67832 merged revision(s) c9423b016cfeab852bc5a829e55e0a11f80b3ab7,0b1e26398e018116180bf41cb63887f77d5d1b82,78ee2c245331e353e218b8fac9ca722a2bcd8fea.
Updated by usa (Usaku NAKAMURA) over 4 years ago
- Backport changed from 2.4: REQUIRED, 2.5: REQUIRED, 2.6: DONE to 2.4: REQUIRED, 2.5: DONE, 2.6: DONE
ruby_2_5 r67861 merged revision(s) c9423b016cfeab852bc5a829e55e0a11f80b3ab7,0b1e26398e018116180bf41cb63887f77d5d1b82,78ee2c245331e353e218b8fac9ca722a2bcd8fea.