Bug #21127
Updated by pocke (Masataka Kuwabara) 3 months ago
Previously, Ruby displayed backtraces for each thread on deadlock. However, it has not been shown since Ruby 3.0. It should display the backtrace for debugging. (related: https://bugs.ruby-lang.org/issues/8214) ## Step to reproduce ```ruby def f(q) g(q) end def g(q) h(q) end def h(q) q.pop end q = Thread::Queue.new th1 = Thread.new { f q } th2 = Thread.new { f q } th1.join th2.join ``` Ruby 2.7.8: It displays the trace. ``` ruby-2.7.8 ruby 2.7.8p225 (2023-03-30 revision 1f4d455848) [x86_64-linux] -e:9:in `join': No live threads left. Deadlock? (fatal) 3 threads, 3 sleeps current:0x0000004000653e80 main thread:0x000000400035da50 * #<Thread:0x000000400038ad70 sleep_forever> rb_thread_t:0x000000400035da50 native:0x0000004002fb7740 int:0 -e:9:in `join' -e:9:in `<main>' * #<Thread:0x00000040004fe760 -e:6 sleep_forever> rb_thread_t:0x00000040004518a0 native:0x00000040067ae700 int:0 depended by: tb_thread_id:0x000000400035da50 -e:3:in `pop' -e:3:in `h' -e:2:in `g' -e:1:in `f' -e:6:in `block in <main>' * #<Thread:0x00000040004fe620 -e:7 sleep_forever> rb_thread_t:0x0000004000653e80 native:0x000000400e9af700 int:0 -e:3:in `pop' -e:3:in `h' -e:2:in `g' -e:1:in `f' -e:7:in `block in <main>' from -e:9:in `<main>' ``` Ruby 3.0.0: There is no trace. ``` ruby-3.0.0 ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-linux] -e:9:in `join': No live threads left. Deadlock? (fatal) 3 threads, 3 sleeps current:0x00000040006e88c0 main thread:0x00000040003ffa10 * #<Thread:0x000000400046f448 sleep_forever> rb_thread_t:0x00000040003ffa10 native:0x0000004003059740 int:0 * #<Thread:0x000000400073c2e8 -e:6 sleep_forever> rb_thread_t:0x00000040006e15a0 native:0x0000004006850700 int:0 depended by: tb_thread_id:0x00000040003ffa10 * #<Thread:0x000000400073c1f8 -e:7 sleep_forever> rb_thread_t:0x00000040006e88c0 native:0x000000400ea51700 int:0 from -e:9:in `<main>' ``` Ruby 3.5.0.dev (latest): There is no trace. ``` ruby 3.5.0dev (2025-02-10T00:27:18Z master 135479a58d) +PRISM [arm64-darwin24] -e:9:in 'Thread#join': No live threads left. Deadlock? (fatal) 3 threads, 3 sleeps current:0x000000012ae3b530 main thread:0x000000012af04080 * #<Thread:0x0000000104aaaf78 sleep_forever> rb_thread_t:0x000000012af04080 native:0x00000001e9d93240 int:0 * #<Thread:0x0000000104e5e048 -e:6 sleep_forever> rb_thread_t:0x000000012ae23a90 native:0x000000016b5f3000 int:0 depended by: tb_thread_id:0x000000012af04080 * #<Thread:0x0000000104e5deb8 -e:7 sleep_forever> rb_thread_t:0x000000012ae3b530 native:0x000000016b6ff000 int:0 from -e:9:in '<main>' ``` Note: The results of 2.7 and 3.0 are indented because I executed them with all-ruby. ## Cause This commit has introduced this change. https://github.com/ruby/ruby/commit/3b24b7914c16930bfadc89d6aff6326a51c54295 The ticket is https://bugs.ruby-lang.org/issues/17031 It changes `rb_ec_backtrace_str_ary` function in `vm_backtrace.c`. Previously, `rb_ec_backtrace_str_ary(ec, 0, 0)` returns all traces, but it returns an empty array with this change. It should pass `ALL_BACKTRACE_LINES` (`-1`) to get all traces. I used rubyfarm also to confirm that this commit was the cause. With the commit, it does not display the traces. ``` docker run --rm -ti rubylang/rubyfarm:3b24b7914c16930bfadc89d6aff6326a51c54295 ruby -ve 'def f(q) g(q) end def g(q) h(q) end def h(q) q.pop end q = Thread::Queue.new th1 = Thread.new { f q } th2 = Thread.new { f q } th1.join th2.join' ruby 2.8.0dev (2020-08-27T22:17:36Z :detached: 3b24b7914c) [x86_64-linux] -e:9:in `join': No live threads left. Deadlock? (fatal) 3 threads, 3 sleeps current:0x0000004000566a80 main thread:0x0000004000202a60 * #<Thread:0x000000400026faf8 sleep_forever> rb_thread_t:0x0000004000202a60 native:0x000000400440d740 int:0 * #<Thread:0x00000040004f1768 -e:6 sleep_forever> rb_thread_t:0x0000004000566710 native:0x00000040081fe700 int:0 depended by: tb_thread_id:0x0000004000202a60 * #<Thread:0x00000040004f1600 -e:7 sleep_forever> rb_thread_t:0x0000004000566a80 native:0x00000040103ff700 int:0 from -e:9:in `<main>' ``` With the parent commit, it displays the traces. ``` docker run --rm -ti rubylang/rubyfarm:8095114f1715070fcdc2b29303dcf55a7fcc32a3 ruby -ve 'def f(q) g(q) end def g(q) h(q) end def h(q) q.pop end q = Thread::Queue.new th1 = Thread.new { f q } th2 = Thread.new { f q } th1.join th2.join' ruby 2.8.0dev (2020-08-27T19:54:36Z :detached: 8095114f17) [x86_64-linux] -e:9:in `join': No live threads left. Deadlock? (fatal) 3 threads, 3 sleeps current:0x000000400055b3a0 main thread:0x0000004000202a60 * #<Thread:0x000000400026faf8 sleep_forever> rb_thread_t:0x0000004000202a60 native:0x000000400440d740 int:0 -e:9:in `join' -e:9:in `<main>' * #<Thread:0x00000040004f1768 -e:6 sleep_forever> rb_thread_t:0x000000400055afe0 native:0x00000040081fe700 int:0 depended by: tb_thread_id:0x0000004000202a60 -e:3:in `pop' -e:3:in `h' -e:2:in `g' -e:1:in `f' -e:6:in `block in <main>' * #<Thread:0x00000040004f1600 -e:7 sleep_forever> rb_thread_t:0x000000400055b3a0 native:0x00000040103ff700 int:0 -e:3:in `pop' -e:3:in `h' -e:2:in `g' -e:1:in `f' -e:7:in `block in <main>' from -e:9:in `<main>' ``` I think it's not an intentional change. The ticket and NEWS do not mention the backtrace on a deadlock. (And the backtrace is really helpful!) ## Solution The following patch fixes the problem. PR: https://github.com/ruby/ruby/pull/12721 I'll make a PR on GitHub soon. ```diff diff --git a/eval_intern.h b/eval_intern.h index 9a551120ff..f783c93aff 100644 --- a/eval_intern.h +++ b/eval_intern.h @@ -300,6 +300,8 @@ void rb_vm_set_progname(VALUE filename); VALUE rb_vm_cbase(void); /* vm_backtrace.c */ +#define BACKTRACE_START 0 +#define ALL_BACKTRACE_LINES -1 VALUE rb_ec_backtrace_object(const rb_execution_context_t *ec); VALUE rb_ec_backtrace_str_ary(const rb_execution_context_t *ec, long lev, long n); VALUE rb_ec_backtrace_location_ary(const rb_execution_context_t *ec, long lev, long n, bool skip_internal); diff --git a/thread.c b/thread.c index d40d5140b7..5280c667af 100644 --- a/thread.c +++ b/thread.c @@ -5596,7 +5596,7 @@ debug_deadlock_check(rb_ractor_t *r, VALUE msg) } } rb_str_catf(msg, "\n "); - rb_str_concat(msg, rb_ary_join(rb_ec_backtrace_str_ary(th->ec, 0, 0), sep)); + rb_str_concat(msg, rb_ary_join(rb_ec_backtrace_str_ary(th->ec, BACKTRACE_START, ALL_BACKTRACE_LINES), sep)); rb_str_catf(msg, "\n"); } } diff --git a/vm_backtrace.c b/vm_backtrace.c index 56a0d861d6..6f2cf9e169 100644 --- a/vm_backtrace.c +++ b/vm_backtrace.c @@ -31,9 +31,6 @@ id2str(ID id) } #define rb_id2str(id) id2str(id) -#define BACKTRACE_START 0 -#define ALL_BACKTRACE_LINES -1 - inline static int calc_pos(const rb_iseq_t *iseq, const VALUE *pc, int *lineno, int *node_id) { ``` The result is here: ``` $ ruby -ve 'def f(q) g(q) end def g(q) h(q) end def h(q) q.pop end q = Thread::Queue.new th1 = Thread.new { f q } th2 = Thread.new { f q } th1.join th2.join' ruby 3.5.0dev (2025-02-10T00:27:18Z origin/master 135479a58d) +PRISM [arm64-darwin24] -e:9:in 'Thread#join': No live threads left. Deadlock? (fatal) 3 threads, 3 sleeps current:0x0000000157a053a0 main thread:0x0000000156704080 * #<Thread:0x000000011bcfaf78 sleep_forever> rb_thread_t:0x0000000156704080 native:0x00000001e9d93240 int:0 -e:9:in 'Thread#join' -e:9:in '<main>' * #<Thread:0x000000011d2d0bb8 -e:6 sleep_forever> rb_thread_t:0x0000000157a04d40 native:0x000000016dcb7000 int:0 depended by: tb_thread_id:0x0000000156704080 <internal:thread_sync>:18:in 'Thread::Queue#pop' -e:3:in 'Object#h' -e:2:in 'Object#g' -e:1:in 'Object#f' -e:6:in 'block in <main>' * #<Thread:0x000000011d2d07a8 -e:7 sleep_forever> rb_thread_t:0x0000000157a053a0 native:0x000000016ddc3000 int:0 <internal:thread_sync>:18:in 'Thread::Queue#pop' -e:3:in 'Object#h' -e:2:in 'Object#g' -e:1:in 'Object#f' -e:7:in 'block in <main>' from -e:9:in '<main>' ```