https://bugs.ruby-lang.org/https://bugs.ruby-lang.org/favicon.ico?17113305112021-07-29T23:32:59ZRuby Issue Tracking SystemRuby master - Bug #18053: Crashes and infinite loops when generating partial backtraces in Ruby 3.0+https://bugs.ruby-lang.org/issues/18053?journal_id=930582021-07-29T23:32:59Zjeremyevans0 (Jeremy Evans)merch-redmine@jeremyevans.net
<ul></ul><p>Thanks for the report. I can confirm both of the issues. I checked with my rewrite of the backtrace generation algorithm (<a href="https://github.com/ruby/ruby/pull/4671" class="external">https://github.com/ruby/ruby/pull/4671</a>), and confirmed that it fixes both of these issues, in addition to a separate issue I identified. The new algorithm works from the most recent frame towards the oldest frame instead of the oldest frame to the most current frame, which makes more sense for partial backtraces, and is about twice as fast.</p>
<p>As to what to do about 3.0, I can try to fix this, or we can revert the optimization. After the previous bugfixes (see <a href="https://github.com/ruby/ruby/pull/4120" class="external">https://github.com/ruby/ruby/pull/4120</a> and <a href="https://github.com/ruby/ruby/pull/4237" class="external">https://github.com/ruby/ruby/pull/4237</a>), the optimization is very complex, and would probably need to be made more complex to fix this and the other issue I identified. The new algorithm, in addition to being faster, is much simpler.</p>
<p>At this point, I think reverting the optimization in 3.0 is the best way forward. We can switch to the new algorithm in 3.1. However, if people would like to keep the optimization in 3.0, I can work on fixing it.</p> Ruby master - Bug #18053: Crashes and infinite loops when generating partial backtraces in Ruby 3.0+https://bugs.ruby-lang.org/issues/18053?journal_id=930592021-07-29T23:41:01Zjhawthorn (John Hawthorn)
<ul></ul><p>jeremyevans0 (Jeremy Evans) wrote in <a href="#note-1">#note-1</a>:</p>
<blockquote>
<p>At this point, I think reverting the optimization in 3.0 is the best way forward. We can switch to the new algorithm in 3.1. However, if people would like to keep the optimization in 3.0, I can work on fixing it.</p>
</blockquote>
<p>Thanks for taking a look! Reverting would be fine for us.</p> Ruby master - Bug #18053: Crashes and infinite loops when generating partial backtraces in Ruby 3.0+https://bugs.ruby-lang.org/issues/18053?journal_id=931492021-08-06T17:15:20Zjeremyevans (Jeremy Evans)code@jeremyevans.net
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Closed</i></li></ul><p>Applied in changeset <a class="changeset" title="Make backtrace generation work outward from current frame This fixes multiple bugs found in the ..." href="https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/1a05dc03f953830564c272665c47a61e53550f3e">git|1a05dc03f953830564c272665c47a61e53550f3e</a>.</p>
<hr>
<p>Make backtrace generation work outward from current frame</p>
<p>This fixes multiple bugs found in the partial backtrace<br>
optimization added in 3b24b7914c16930bfadc89d6aff6326a51c54295.<br>
These bugs occurs when passing a start argument to caller where<br>
the start argument lands on a iseq frame without a pc.</p>
<p>Before this commit, the following code results in the same<br>
line being printed twice, both for the #each method.</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="k">def</span> <span class="nf">a</span><span class="p">;</span> <span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nf">group_by</span> <span class="p">{</span> <span class="n">b</span> <span class="p">}</span> <span class="k">end</span>
<span class="k">def</span> <span class="nf">b</span><span class="p">;</span> <span class="nb">puts</span><span class="p">(</span><span class="nb">caller</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">).</span><span class="nf">first</span><span class="p">,</span> <span class="nb">caller</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">).</span><span class="nf">first</span><span class="p">)</span> <span class="k">end</span>
<span class="n">a</span>
</code></pre>
<p>After this commit and in Ruby 2.7, the lines are different,<br>
with the first line being for each and the second for group_by.</p>
<p>Before this commit, the following code can either segfault or<br>
result in an infinite loop:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="k">def</span> <span class="nf">foo</span>
<span class="n">caller_locations</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">).</span><span class="nf">inspect</span> <span class="c1"># segfault</span>
<span class="n">caller_locations</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">)[</span><span class="mi">0</span><span class="p">].</span><span class="nf">path</span> <span class="c1"># infinite loop</span>
<span class="k">end</span>
<span class="mi">1</span><span class="p">.</span><span class="nf">times</span><span class="p">.</span><span class="nf">map</span> <span class="p">{</span> <span class="mi">1</span><span class="p">.</span><span class="nf">times</span><span class="p">.</span><span class="nf">map</span> <span class="p">{</span> <span class="n">foo</span> <span class="p">}</span> <span class="p">}</span>
</code></pre>
<p>After this commit, this code works correctly.</p>
<p>This commit completely refactors the backtrace handling.<br>
Instead of processing the backtrace from the outermost<br>
frame working in, process it from the innermost frame<br>
working out. This is much faster for partial backtraces,<br>
since you only access the control frames you need to in<br>
order to construct the backtrace.</p>
<p>To handle cfunc frames in the new design, they start<br>
out with no location information. We increment a counter<br>
for each cfunc frame added. When an iseq frame with pc<br>
is accessed, after adding the iseq backtrace location,<br>
we use the location for the iseq backtrace location for<br>
all of the directly preceding cfunc backtrace locations.</p>
<p>If the last backtrace line is a cfunc frame, we continue<br>
scanning for iseq frames until the end control frame, and<br>
use the location information from the first one for the<br>
trailing cfunc frames in the backtrace.</p>
<p>As only rb_ec_partial_backtrace_object uses the new<br>
backtrace implementation, remove all of the function<br>
pointers and inline the functions. This makes the<br>
process easier to understand.</p>
<p>Restore the Ruby 2.7 implementation of backtrace_each and<br>
use it for all the other functions that called<br>
backtrace_each other than rb_ec_partial_backtrace_object.<br>
All other cases requested the entire backtrace, so there<br>
is no advantage of using the new algorithm for those.<br>
Additionally, there are implicit assumptions in the other<br>
code that the backtrace processing works inward instead<br>
of outward.</p>
<p>Remove the cfunc/iseq union in rb_backtrace_location_t,<br>
and remove the prev_loc member for cfunc. Both cfunc and<br>
iseq types can now have iseq and pc entries, so the<br>
location information can be accessed the same way for each.<br>
This avoids the need for a extra backtrace location entry<br>
to store an iseq backtrace location if the final entry in<br>
the backtrace is a cfunc. This is also what fixes the<br>
segfault and infinite loop issues in the above bugs.</p>
<p>Here's Ruby pseudocode for the new algorithm, where start<br>
and length are the arguments to caller or caller_locations:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">end_cf</span> <span class="o">=</span> <span class="no">VM</span><span class="p">.</span><span class="nf">end_control_frame</span><span class="p">.</span><span class="nf">next</span>
<span class="n">cf</span> <span class="o">=</span> <span class="no">VM</span><span class="p">.</span><span class="nf">start_control_frame</span>
<span class="n">size</span> <span class="o">=</span> <span class="no">VM</span><span class="p">.</span><span class="nf">num_control_frames</span> <span class="o">-</span> <span class="mi">2</span>
<span class="n">bt</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">cfunc_counter</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">if</span> <span class="n">length</span><span class="p">.</span><span class="nf">nil?</span> <span class="o">||</span> <span class="n">length</span> <span class="o">></span> <span class="n">size</span>
<span class="n">length</span> <span class="o">=</span> <span class="n">size</span>
<span class="k">end</span>
<span class="k">while</span> <span class="n">cf</span> <span class="o">!=</span> <span class="n">end_cf</span> <span class="o">&&</span> <span class="n">bt</span><span class="p">.</span><span class="nf">size</span> <span class="o">!=</span> <span class="n">length</span>
<span class="k">if</span> <span class="n">cf</span><span class="p">.</span><span class="nf">iseq?</span>
<span class="k">if</span> <span class="n">cf</span><span class="p">.</span><span class="nf">instruction_pointer?</span>
<span class="k">if</span> <span class="n">start</span> <span class="o">></span> <span class="mi">0</span>
<span class="n">start</span> <span class="o">-=</span> <span class="mi">1</span>
<span class="k">else</span>
<span class="n">bt</span> <span class="o"><<</span> <span class="n">cf</span><span class="p">.</span><span class="nf">iseq_backtrace_entry</span>
<span class="n">cf_counter</span><span class="p">.</span><span class="nf">times</span> <span class="k">do</span> <span class="o">|</span><span class="n">i</span><span class="o">|</span>
<span class="n">bt</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span> <span class="o">-</span> <span class="n">i</span><span class="p">].</span><span class="nf">loc</span> <span class="o">=</span> <span class="n">cf</span><span class="p">.</span><span class="nf">loc</span>
<span class="k">end</span>
<span class="n">cfunc_counter</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">elsif</span> <span class="n">cf</span><span class="p">.</span><span class="nf">cfunc?</span>
<span class="k">if</span> <span class="n">start</span> <span class="o">></span> <span class="mi">0</span>
<span class="n">start</span> <span class="o">-=</span> <span class="mi">1</span>
<span class="k">else</span>
<span class="n">bt</span> <span class="o"><<</span> <span class="n">cf</span><span class="p">.</span><span class="nf">cfunc_backtrace_entry</span>
<span class="n">cfunc_counter</span> <span class="o">+=</span> <span class="mi">1</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="n">cf</span> <span class="o">=</span> <span class="n">cf</span><span class="p">.</span><span class="nf">prev</span>
<span class="k">end</span>
<span class="k">if</span> <span class="n">cfunc_counter</span> <span class="o">></span> <span class="mi">0</span>
<span class="k">while</span> <span class="n">cf</span> <span class="o">!=</span> <span class="n">end_cf</span>
<span class="k">if</span> <span class="p">(</span><span class="n">cf</span><span class="p">.</span><span class="nf">iseq?</span> <span class="o">&&</span> <span class="n">cf</span><span class="p">.</span><span class="nf">instruction_pointer?</span><span class="p">)</span>
<span class="n">cf_counter</span><span class="p">.</span><span class="nf">times</span> <span class="k">do</span> <span class="o">|</span><span class="n">i</span><span class="o">|</span>
<span class="n">bt</span><span class="p">[</span><span class="o">-</span><span class="n">i</span><span class="p">].</span><span class="nf">loc</span> <span class="o">=</span> <span class="n">cf</span><span class="p">.</span><span class="nf">loc</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="n">cf</span> <span class="o">=</span> <span class="n">cf</span><span class="p">.</span><span class="nf">prev</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre>
<p>With the following benchmark, which uses a call depth of<br>
around 100 (common in many Ruby applications):</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="k">class</span> <span class="nc">T</span>
<span class="k">def</span> <span class="nf">test</span><span class="p">(</span><span class="n">depth</span><span class="p">,</span> <span class="o">&</span><span class="n">block</span><span class="p">)</span>
<span class="k">if</span> <span class="n">depth</span> <span class="o">==</span> <span class="mi">0</span>
<span class="k">yield</span> <span class="nb">self</span>
<span class="k">else</span>
<span class="nb">test</span><span class="p">(</span><span class="n">depth</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="o">&</span><span class="n">block</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">array</span>
<span class="no">Array</span><span class="p">.</span><span class="nf">new</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">first</span>
<span class="n">caller_locations</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
<span class="k">end</span>
<span class="k">def</span> <span class="nf">full</span>
<span class="n">caller_locations</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="n">t</span> <span class="o">=</span> <span class="no">T</span><span class="p">.</span><span class="nf">new</span>
<span class="n">t</span><span class="p">.</span><span class="nf">test</span><span class="p">((</span><span class="no">ARGV</span><span class="p">.</span><span class="nf">first</span> <span class="o">||</span> <span class="mi">100</span><span class="p">).</span><span class="nf">to_i</span><span class="p">)</span> <span class="k">do</span>
<span class="no">Benchmark</span><span class="p">.</span><span class="nf">ips</span> <span class="k">do</span> <span class="o">|</span><span class="n">x</span><span class="o">|</span>
<span class="n">x</span><span class="p">.</span><span class="nf">report</span> <span class="p">(</span><span class="s1">'caller_loc(1, 1)'</span><span class="p">)</span> <span class="p">{</span><span class="n">t</span><span class="p">.</span><span class="nf">first</span><span class="p">}</span>
<span class="n">x</span><span class="p">.</span><span class="nf">report</span> <span class="p">(</span><span class="s1">'caller_loc'</span><span class="p">)</span> <span class="p">{</span><span class="n">t</span><span class="p">.</span><span class="nf">full</span><span class="p">}</span>
<span class="n">x</span><span class="p">.</span><span class="nf">report</span> <span class="p">(</span><span class="s1">'Array.new'</span><span class="p">)</span> <span class="p">{</span><span class="n">t</span><span class="p">.</span><span class="nf">array</span><span class="p">}</span>
<span class="n">x</span><span class="p">.</span><span class="nf">compare!</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre>
<p>Results before commit:</p>
<pre><code>Calculating -------------------------------------
caller_loc(1, 1) 281.159k (_ 0.7%) i/s - 1.426M in 5.073055s
caller_loc 15.836k (_ 2.1%) i/s - 79.450k in 5.019426s
Array.new 1.852M (_ 2.5%) i/s - 9.296M in 5.022511s
Comparison:
Array.new: 1852297.5 i/s
caller_loc(1, 1): 281159.1 i/s - 6.59x (_ 0.00) slower
caller_loc: 15835.9 i/s - 116.97x (_ 0.00) slower
</code></pre>
<p>Results after commit:</p>
<pre><code>Calculating -------------------------------------
caller_loc(1, 1) 562.286k (_ 0.8%) i/s - 2.858M in 5.083249s
caller_loc 16.402k (_ 1.0%) i/s - 83.200k in 5.072963s
Array.new 1.853M (_ 0.1%) i/s - 9.278M in 5.007523s
Comparison:
Array.new: 1852776.5 i/s
caller_loc(1, 1): 562285.6 i/s - 3.30x (_ 0.00) slower
caller_loc: 16402.3 i/s - 112.96x (_ 0.00) slower
</code></pre>
<p>This shows that the speed of caller_locations(1, 1) has roughly<br>
doubled, and the speed of caller_locations with no arguments<br>
has improved slightly. So this new algorithm is significant faster,<br>
much simpler, and fixes bugs in the previous algorithm.</p>
<p>Fixes [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Crashes and infinite loops when generating partial backtraces in Ruby 3.0+ (Closed)" href="https://bugs.ruby-lang.org/issues/18053">#18053</a>]</p> Ruby master - Bug #18053: Crashes and infinite loops when generating partial backtraces in Ruby 3.0+https://bugs.ruby-lang.org/issues/18053?journal_id=931502021-08-06T17:28:51Zjeremyevans0 (Jeremy Evans)merch-redmine@jeremyevans.net
<ul><li><strong>Backport</strong> changed from <i>2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN</i> to <i>2.6: DONTNEED, 2.7: DONTNEED, 3.0: REQUIRED</i></li></ul><p>I committed my rewrite of backtrace generation (<a class="changeset" title="Make backtrace generation work outward from current frame This fixes multiple bugs found in the ..." href="https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/1a05dc03f953830564c272665c47a61e53550f3e">1a05dc03f953830564c272665c47a61e53550f3e</a>), which fixes these issues in master/3.1. I took the examples from this issue and added them as tests in that commit.</p>
<p>For Ruby 3.0, there are a few options:</p>
<ol>
<li>Backout partial backtrace optimization</li>
<li>Merge rewrite of backtrace generation</li>
<li>Fix partial backtrace optimization.</li>
</ol>
<p>The issue with option 1 is that <a class="changeset" title="Ignore <internal: entries from core library methods for Kernel#warn(message, uplevel: n) * Fixes..." href="https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/cffdacb15a363321e1c1879aa7d94924acafd1cf">cffdacb15a363321e1c1879aa7d94924acafd1cf</a> was merged after the optimization was committed, and it is nontrivial to backout the partial backtrace optimization and still keep that feature. I tried, but my attempt could not pass CI. Additionally, even if the issues with the backout could be fixed, using the backout approach would make partial backtraces slower.</p>
<p>The issue with option 2 is that <a class="changeset" title="Make backtrace generation work outward from current frame This fixes multiple bugs found in the ..." href="https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/1a05dc03f953830564c272665c47a61e53550f3e">1a05dc03f953830564c272665c47a61e53550f3e</a> is fairly invasive. I'm fairly sure it would apply cleanly to 3.0, but I'm not sure if <a class="user active user-mention" href="https://bugs.ruby-lang.org/users/404">@nagachika (Tomoyuki Chikanaga)</a> would be comfortable backporting such a large change.</p>
<p>So I worked on implementing option 3. It took many hours to make the new test cases pass without breaking existing test cases, but I eventually did get it to work. The way I got it to work was using a similar strategy as was used for <a class="changeset" title="Make backtrace generation work outward from current frame This fixes multiple bugs found in the ..." href="https://bugs.ruby-lang.org/projects/ruby-master/repository/git/revisions/1a05dc03f953830564c272665c47a61e53550f3e">1a05dc03f953830564c272665c47a61e53550f3e</a>. Basically, scan the backtrace from the current frame toward the earliest frame instead of from the earlier frame to the current frame. When we come to the frame that will complete the partial backtrace, we use that as the starting frame for backtrace generation, and backtrace generation still works from earlier frames towards the current frame. I submitted a pull request for this option: <a href="https://github.com/ruby/ruby/pull/4714" class="external">https://github.com/ruby/ruby/pull/4714</a></p> Ruby master - Bug #18053: Crashes and infinite loops when generating partial backtraces in Ruby 3.0+https://bugs.ruby-lang.org/issues/18053?journal_id=931762021-08-08T03:35:06Znagachika (Tomoyuki Chikanaga)nagachika00@gmail.com
<ul></ul><p>Hello Jeremy,<br>
Thank you for your great work on the partial backtrace optimization and deep understanding for the stable branch maintenance. I'd like to adopt your 3rd option. I will merge your PR after confirming the CI passed with it.</p> Ruby master - Bug #18053: Crashes and infinite loops when generating partial backtraces in Ruby 3.0+https://bugs.ruby-lang.org/issues/18053?journal_id=931772021-08-08T04:00:39Znagachika (Tomoyuki Chikanaga)nagachika00@gmail.com
<ul><li><strong>Backport</strong> changed from <i>2.6: DONTNEED, 2.7: DONTNEED, 3.0: REQUIRED</i> to <i>2.6: DONTNEED, 2.7: DONTNEED, 3.0: DONE</i></li></ul><p>ruby_3_0 c07545bbf82068f4fd92a5ccfa2b09bb96b39adb.</p> Ruby master - Bug #18053: Crashes and infinite loops when generating partial backtraces in Ruby 3.0+https://bugs.ruby-lang.org/issues/18053?journal_id=932492021-08-11T17:58:29Zjhawthorn (John Hawthorn)
<ul></ul><p>Thanks Jeremy! Just confirmed that the fix in ruby_3_0 solves all the crashes and hangs we were seeing.</p>