Ruby Issue Tracking System (https://bugs.ruby-lang.org/) Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=301732012-10-10T23:39:44Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Feedback</i></li></ul><p>I can't reproduce it.</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=301922012-10-11T05:12:38Zalexdowad (Alex Dowad)alexinbeijing@gmail.com
<ul></ul><p>Nobuさん, I don't expect that you (or anyone else) would be able to reproduce this bug. As I said, it doesn't happen when I extract the part which is failing from Prawn, only when I run the tests against the whole thing (which I have modified -- I'm working on performance). This is not strange -- in general, memory corruption/pointer bugs are sensitive to the exact layout of data in memory, and changing small things in a program may randomly turn the bug on or off.</p>
<p>As I said, I can try to dig deeper and diagnose the bug myself, but I need some advice on where to add "debug" code to the Ruby source (so I can recompile, run the code which is failing, and try to get more information on what is actually happening).</p>
<p>To sum up the problem again, I have Ruby Strings which are randomly being overwritten (although nothing at the Ruby level is modifying them), and it only happens when the GC runs. Actually, I just discovered that if I put a call to "GC.start" in the "string.codepoints.inject" loop, the error happens <em>every time</em>. UNLESS I freeze the string -- then the error never happens, even with "GC.start" in the loop.</p>
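<p>For readers who want to try the same kind of check, here is a minimal sketch of the stress test described above. It is not Prawn's code: <code>character_width_by_code</code> is a hypothetical stand-in for the real width lookup, the 7-byte sample string is made up, and <code>stressed_width</code> is an illustrative name. On a Ruby build without the bug, both comparisons should hold.</p>

```ruby
# Hypothetical stand-in for Prawn's width lookup (the real method is not shown in this thread).
def character_width_by_code(code)
  code < 0x80 ? 500 : 1000
end

# Sums character widths while forcing a full GC run on every iteration,
# mirroring the "GC.start inside the inject loop" experiment.
def stressed_width(string)
  string.each_codepoint.inject(0) do |sum, code|
    GC.start
    sum + character_width_by_code(code)
  end
end

text = "h\u00e9llo!".dup # 6 characters, 7 bytes, not frozen

# Reference value computed from a plain array, without GC pressure in the loop.
expected = text.codepoints.to_a.map { |c| character_width_by_code(c) }.inject(0, :+)

unfrozen_ok = stressed_width(text) == expected
frozen_ok   = stressed_width(text.dup.freeze) == expected

puts "unfrozen: #{unfrozen_ok}, frozen: #{frozen_ok}"
```

<p>A multibyte character is included on purpose: as the C source quoted later in this thread shows, single-byte strings take a different code path.</p>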
<p>A few questions for someone who knows Ruby internals well:</p>
<ul>
<li>When Ruby GCs an unused object, does it zero out the memory used?</li>
<li>How about when a new object is allocated?</li>
<li>I've heard that Ruby stores the contents of small strings directly in an RObject (or RValue or whatever it is...) union. The String which is being corrupted has 7 bytes. Will a String like that <em>always</em> be embedded, or is it possible that it could still use malloc'd memory for the contents?</li>
<li>In the tests which I am doing right now, it always seems that byte 0 is untouched, byte 1 is changed to 1, and bytes 2-6 are changed to 0. Do those values seem familiar? Is there a different type of object which can go in the same union, which would set those particular bytes?</li>
</ul> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302092012-10-11T06:23:33Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p>"alexdowad (Alex Dowad)" <a href="mailto:alexinbeijing@gmail.com" class="email">alexinbeijing@gmail.com</a> wrote:</p>
<blockquote>
<p>Nobuさん, I don't expect that you (or anyone else) would be able to reproduce this bug. As I said, it doesn't happen when I extract the part which is failing from Prawn, only when I run the tests against the whole thing (which I have modified -- I'm working on performance). This is not strange -- in general, memory corruption/pointer bugs are sensitive to the exact layout of data in memory, and changing small things in a program may randomly turn the bug on or off.</p>
</blockquote>
<p>Does this happen with unmodified Prawn at all?</p>
<p>I'm not familiar with Prawn, but do any of its dependencies pull in<br>
extra C extensions which may have memory corruption bugs?</p>
<p>Can you share your work-in-progress changes to Prawn?</p>
<blockquote>
<ul>
<li>When Ruby GCs an unused object, does it zero out the memory used?</li>
</ul>
</blockquote>
<p>No</p>
<blockquote>
<ul>
<li>How about when a new object is allocated?</li>
</ul>
</blockquote>
<p>Yes.</p>
<blockquote>
<ul>
<li>I've heard that Ruby stores the contents of small strings directly<br>
in an RObject (or RValue or whatever it is...) union. The String which<br>
is being corrupted has 7 bytes. Will a String like that <em>always</em> be<br>
embedded, or is it possible that it could still use malloc'd memory<br>
for the contents?</li>
</ul>
</blockquote>
<p>It's possible to use malloc'ed memory for short string contents<br>
(string capacity can be larger than length)</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302142012-10-11T06:50:50Zalexdowad (Alex Dowad)alexinbeijing@gmail.com
<ul></ul><blockquote>
<p>Does this happen with unmodified Prawn at all?</p>
</blockquote>
<p>Good question. I haven't spent a lot of time repeatedly running the spec tests for "unmodified" Prawn. Generally when I run some tests, it's because I'm contributing a patch to the gem, and I want to make sure I haven't broken anything.</p>
<p>I can tell you, though, that a few weeks ago, when I was working on a completely unrelated patch to Prawn, I also started getting an intermittent "invalid byte sequence in UTF-8" error when I was testing against Ruby 1.9.2. When I was tracing the error, I found the same thing with String#codepoints returning inconsistent results. I discovered that freezing the String made the problem "go away", and did a little reading of the Ruby source (which revealed that String#codepoints seems to treat frozen strings specially). It never occurred to me at the time that the problem might have anything to do with the GC, and I didn't pursue it further until now.</p>
<blockquote>
<p>I'm not familiar with Prawn, but do any of its dependencies pull in<br>
extra C extensions which may have memory corruption bugs?</p>
</blockquote>
<p>No. The core team has committed to <em>never</em> using binary gems, only pure Ruby.</p>
<blockquote>
<p>Can you share your work-in-progress changes to Prawn?</p>
</blockquote>
<p>Do you really want to spend a few hours or days of your life helping to track down an obscure memory bug? If so, I'll push the code I am working on to GitHub.</p>
<p>At this point I think I have already got as much information as I can from Ruby-land -- I need to crack Ruby open and drop down into C-land. Right now I'm mainly fishing for information and ideas which will help when I do that...</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302152012-10-11T07:00:53Zalexdowad (Alex Dowad)alexinbeijing@gmail.com
<ul></ul><p>OK, I found a couple more significant things:</p>
<ol>
<li>I can reproduce the problem on Ruby 1.9.2 and 1.9.3, but never 1.8.7.</li>
<li>I tried putting calls to "print string.bytes.to_a" and "print string.codepoints.to_a" inside the "string.codepoints.inject" loop. They <em>always</em> print the correct sequence of values, even when the codepoints being passed in to the "inject" block are incorrect!</li>
</ol>
<p>Could the GC be moving the String, leaving the Enumerator returned by #codepoints pointing to an unused block of memory?</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302162012-10-11T07:03:47Zalexdowad (Alex Dowad)alexinbeijing@gmail.com
<ul></ul><p>More information:</p>
<p>When I restructure the code to avoid using an Enumerator, like this:</p>
<pre><code>s = 0
string.codepoints do |r|
  GC.start if $my_debug
  if $my_debug
    print r, "(", string.codepoints.to_a.inspect, "),"
  end
  s += character_width_by_code(r)
end
result = s * scale
</code></pre>
<p>...the problem still occurs under Ruby 1.9.2 and 1.9.3.</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302202012-10-11T07:53:13Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p>"alexdowad (Alex Dowad)" <a href="mailto:alexinbeijing@gmail.com" class="email">alexinbeijing@gmail.com</a> wrote:</p>
<blockquote>
<p>Eric Wong <a href="mailto:normalperson@yhbt.net" class="email">normalperson@yhbt.net</a> wrote:</p>
<blockquote>
<p>I'm not familiar with Prawn, but do any of its dependencies pull in<br>
extra C extensions which may have memory corruption bugs?</p>
</blockquote>
<p>No. The core team has committed to <em>never</em> using binary gems, only<br>
pure Ruby.</p>
</blockquote>
<p>That's good to hear.</p>
<blockquote>
<blockquote>
<p>Can you share your work-in-progress changes to Prawn?</p>
</blockquote>
<p>Do you really want to spend a few hours or days of your life helping<br>
to track down an obscure memory bug? If so, I'll push the code I am<br>
working on to GitHub.</p>
</blockquote>
<p>Me specifically? No, at least not yet :)</p>
<p>However, there's always a chance somebody can help you if given the<br>
right information.</p>
<p>Since this doesn't seem to be caused by a buggy C extension, perhaps it<br>
is a bug which can affect non-Prawn users, too. In that case, more<br>
people will be willing to help you track this problem down.</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302212012-10-11T08:05:35Zalexdowad (Alex Dowad)alexinbeijing@gmail.com
<ul></ul><p>I patched the Ruby interpreter source and recompiled, but I'm having trouble using the resulting binary. The problem is that I don't want to run "make install", because I don't want to mess up my system's setup. But without "make install" a lot of stuff doesn't seem to work. Any suggestions on how to compile and use a patched version of Ruby <em>without</em> installing it as the system-wide default Ruby?</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302222012-10-11T08:32:21Zalexdowad (Alex Dowad)alexinbeijing@gmail.com
<ul></ul><p>OK, I overcame the problem with compiling and testing a patched Ruby binary. When I comment out line 6229 of string.c, the problem goes away. Then when I uncomment the line and recompile Ruby, the problem comes back again. This is the code (it's for String#codepoints):</p>
<pre><code>static VALUE
rb_str_each_codepoint(VALUE str)
{
    VALUE orig = str;
    int n;
    unsigned int c;
    const char *ptr, *end;
    rb_encoding *enc;

    if (single_byte_optimizable(str)) return rb_str_each_byte(str);
    RETURN_ENUMERATOR(str, 0, 0);
    str = rb_str_new4(str); /* I think problem is here */
    ptr = RSTRING_PTR(str);
    end = RSTRING_END(str);
    enc = STR_ENC_GET(str);
    while (ptr < end) {
        c = rb_enc_codepoint_len(ptr, end, &n, enc);
        rb_yield(UINT2NUM(c));
        ptr += n;
    }
    return orig;
}
</code></pre>
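<p>The effect of the <code>rb_str_new4</code> copy can be observed from Ruby. The sketch below is only an illustration with a made-up sample string: it mutates the receiver during iteration and records what the block is given. The string contains a multibyte character so that it does not take the single-byte path at the top of the function.</p>

```ruby
text = "h\u00e9llo".dup # contains a multibyte character, so the copying path is used
seen = []

text.each_codepoint do |code|
  seen << code
  # Replace the receiver's contents after the first codepoint has been yielded.
  text.replace("x") if seen.size == 1
end

# The iteration walked the internal copy, so every original codepoint was yielded
# even though the receiver itself now holds different contents.
puts seen.inspect
puts text.inspect
```

<p>Because the loop reads from the copy through raw C pointers, that copy has to stay alive for the whole loop, which is what the rest of this thread is about.</p>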
<p>Line 6229 copies the String, so that #codepoints won't get messed up if someone modifies it while iterating.</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302232012-10-11T10:08:47Zalexdowad (Alex Dowad)alexinbeijing@gmail.com
<ul></ul><p>OK, I have established beyond all doubt that the contents of the String <em>are</em> being overwritten -- <em>not</em> the original String, but the frozen copy which #codepoints makes internally. Additionally, the overwriting definitely happens when the GC runs.</p>
<p>Question: does the Ruby GC look for Object references by scanning the C-level stack?</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302242012-10-11T10:23:18Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul></ul><p>Hi,</p>
<p>At Thu, 11 Oct 2012 08:32:23 +0900,<br>
alexdowad (Alex Dowad) wrote in <a href="/issues/7135">[ruby-core:47897]</a>:</p>
<blockquote>
<p>OK, I overcame the problem with compiling and testing a<br>
patched Ruby binary. When I comment out line 6229 of<br>
string.c, the problem goes away. Then when I uncomment the<br>
line and recompile Ruby, the problem comes back again. This<br>
is the code (it's for String#codepoints):</p>
</blockquote>
<p>Thank you. Could you try this patch?</p>
<pre><code>diff --git i/string.c w/string.c
index 9281e4c..6707c4b 100644
--- i/string.c
+++ w/string.c
@@ -6332,6 +6332,7 @@ rb_str_each_codepoint(VALUE str)
         rb_yield(UINT2NUM(c));
         ptr += n;
     }
+    RB_GC_GUARD(str);
     return orig;
 }
</code></pre>
<p>--<br>
Nobu Nakada</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302252012-10-11T10:29:28Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul></ul><p>Hi,</p>
<p>At Thu, 11 Oct 2012 10:08:51 +0900,<br>
alexdowad (Alex Dowad) wrote in <a href="/issues/7135">[ruby-core:47898]</a>:</p>
<blockquote>
<p>Question: does the Ruby GC look for Object references by<br>
scanning the stack?</p>
</blockquote>
<p>Yes, of course, but recent compilers often optimize<br>
those variables away.</p>
<p>--<br>
Nobu Nakada</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302262012-10-11T10:30:12Zalexdowad (Alex Dowad)alexinbeijing@gmail.com
<ul></ul><p>YEE-HA!!! I think I may have nailed it!!!</p>
<p>I believe that my compiler was storing the pointer to the frozen string copy in a register, rather than on the stack, so the garbage collector couldn't find any references to it. But even after the frozen copy was GC'd, #codepoints still had C pointers into the middle of its data. After creating a new "volatile" local, and storing the pointer in there, the problem hasn't occurred again.</p>
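<p>A loose Ruby-level analogy, offered only as an illustration and not as a description of the interpreter's internals: a <code>WeakRef</code> behaves a little like the raw C pointers in <code>#codepoints</code>, in that the GC does not count it as a reason to keep the target alive. Only a reference the GC can actually see (here the local variable <code>strong</code>, standing in for the <code>VALUE</code> that the guard keeps visible) protects the object.</p>

```ruby
require "weakref"

# A strong reference the GC can see, comparable to a VALUE that stays visible to the GC.
strong = "h\u00e9llo!".dup

# A weak reference, standing in for a raw pointer that the GC does not trace.
weak = WeakRef.new(strong)

GC.start
alive_with_strong_ref = weak.weakref_alive? # target is still reachable through `strong`

# Drop the only strong reference. From here on the target may be collected by
# any GC run, but whether and when that happens is not guaranteed.
strong = nil
GC.start
alive_after_drop = weak.weakref_alive?

puts "with strong ref: #{alive_with_strong_ref}, after drop: #{alive_after_drop}"
```

<p>The second value is printed without any expectation attached: like the original bug, it depends on details such as what happens to be left on the stack when the GC runs.</p>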
<p>I'm going to do more testing to try to confirm that this is true. If it is, I'll submit a patch for the interpreter. Question: can I submit patches through GitHub pull requests? Or is it necessary to use svn?</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302272012-10-11T10:33:13Zalexdowad (Alex Dowad)alexinbeijing@gmail.com
<ul></ul><p>Hi Nobuさん,</p>
<p>I just saw your messages after posting. Has the patch you showed already been applied to edge Ruby?</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302282012-10-11T10:53:19Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul></ul><p>Hi,</p>
<p>At Thu, 11 Oct 2012 10:33:16 +0900,<br>
alexdowad (Alex Dowad) wrote in <a href="/issues/7135">[ruby-core:47902]</a>:</p>
<blockquote>
<p>I just saw your messages after posting. Has the patch you<br>
showed already been applied to edge Ruby?</p>
</blockquote>
<p>Not yet. I'll apply it if it fixes the bug.</p>
<p>--<br>
Nobu Nakada</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302292012-10-11T10:53:43Zalexdowad (Alex Dowad)alexinbeijing@gmail.com
<ul></ul><blockquote>
<pre><code>+    RB_GC_GUARD(str);
</code></pre>
</blockquote>
<p>This also fixes the problem. I looked on GitHub, and it looks like this patch hasn't been applied to the newest version of the Ruby source... I'll submit a pull request.</p>
<p>Thanks to <a class="user active user-mention" href="https://bugs.ruby-lang.org/users/4">@nobu (Nobuyoshi Nakada)</a> and <a class="user active user-mention" href="https://bugs.ruby-lang.org/users/724">@normalperson (Eric Wong)</a> for your help!!!</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302302012-10-11T11:02:57Zalexdowad (Alex Dowad)alexinbeijing@gmail.com
<ul></ul><blockquote>
<p>Not yet. I'll apply it if it fixes the bug.</p>
</blockquote>
<p>I'd prefer to submit my own PR, if it's OK with you. It would somehow make staying up until 4AM to debug this problem seem worthwhile...</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302312012-10-11T11:29:13Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul></ul><p>Hi,</p>
<p>At Thu, 11 Oct 2012 11:03:31 +0900,<br>
alexdowad (Alex Dowad) wrote in <a href="/issues/7135">[ruby-core:47905]</a>:</p>
<blockquote>
<p>I'd prefer to submit my own PR, if it's OK with you. It would<br>
somehow make staying up until 4AM to debug this problem seem<br>
worthwhile...</p>
</blockquote>
<p>I see. 4AM in which timezone?</p>
<p>--<br>
Nobu Nakada</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302382012-10-11T16:20:50Zalexdowad (Alex Dowad)alexinbeijing@gmail.com
<ul></ul><blockquote>
<p>I see. 4AM in which timezone?</p>
</blockquote>
<p>That's 4AM Zambian time... right now I'm serving as a volunteer in Zambia (and doing Rails-related consulting on the side to support myself and my wife).</p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=302392012-10-11T17:19:31Zalexdowad (Alex Dowad)alexinbeijing@gmail.com
<ul></ul><p>Just sent PR. <a href="https://github.com/ruby/ruby/pull/191" class="external">https://github.com/ruby/ruby/pull/191</a></p> Ruby master - Bug #7135: GC bug in Ruby 1.9.3-p194?https://bugs.ruby-lang.org/issues/7135?journal_id=303102012-10-11T23:09:51Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul><li><strong>Status</strong> changed from <i>Feedback</i> to <i>Closed</i></li><li><strong>% Done</strong> changed from <i>0</i> to <i>100</i></li></ul><p>This issue was solved with changeset r37143.<br>
Alex, thank you for reporting this issue.<br>
Your contribution to Ruby is greatly appreciated.<br>
May Ruby be with you.</p>
<hr>
<p>string.c: GC guard</p>
<ul>
<li>string.c (rb_str_sub{seq,pos,str}, rb_str_each_{line,codepoint}):<br>
prevent String copies from GC. <a href="/issues/7135">[ruby-core:47881]</a> [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: GC bug in Ruby 1.9.3-p194? (Closed)" href="https://bugs.ruby-lang.org/issues/7135">#7135</a>]</li>
</ul>