Bug #20286

Updated by ioquatix (Samuel Williams) 10 months ago

Using TracePoint to trace `thread_begin` and `thread_end` events fails to emit the `thread_end` event when an exception (e.g., Interrupt) is raised within a thread. This behavior occurs because the exception handling bypasses the internal thread finishing logic, including trace point and fiber scheduler cleanup code. This issue affects the ability to accurately monitor thread lifecycle events in scenarios involving exception handling or abrupt thread terminations. 

 ## Steps to Reproduce: 

 1. Set up `TracePoint` to trace `thread_begin` and `thread_end` events. 
 2. Create a new thread that raises an exception. 
 3. Join the thread and observe that only the `thread_begin` event is emitted without a corresponding `thread_end` event. 

 ## Example Code

 ```ruby 
 TracePoint.trace(:thread_begin, :thread_end) do |tp| 
   p [tp.event, tp.lineno, tp.path, tp.defined_class, tp.method_id] 
 end

 thread = Thread.new do 
   raise Interrupt 
 end 

 thread.join 
 ``` 

 ### Current Behavior:

 The `TracePoint` emits the `thread_begin` event but fails to emit the `thread_end` event when an exception is raised within the thread, indicating an incomplete tracing of thread lifecycle events. 

 I've confirmed this as far back as Ruby 2.6. 

 ```
 > ruby ./test.rb
 [:thread_begin, 0, nil, nil, nil]
 #<Thread:0x000000010384b5a8 ./test.rb:5 run> terminated with exception (report_on_exception is true):
 ./test.rb:6:in `block in <main>': Interrupt (Interrupt)
 ./test.rb:6:in `block in <main>': Interrupt (Interrupt)
 ```

 ### Expected Behavior: 

 The `TracePoint` should emit both `thread_begin` and `thread_end` events, accurately reflecting the lifecycle of the thread, even when an exception is raised within the thread. 

 ``` 
 > ruby ./test.rb 
 [:thread_begin, 0, nil, nil, nil] 
 [:thread_end, 0, nil, nil, nil] 
 #<Thread:0x0000000105435db8 ./test.rb:5 run> terminated with exception (report_on_exception is true): 
 ./test.rb:6:in 'block in <main>': Interrupt (Interrupt) 
 ./test.rb:6:in 'block in <main>': Interrupt (Interrupt)
 ``` 
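
 To check the behavior without reading the output by eye, the events can be collected and compared against the expected pair. This is a minimal sketch; the helper name `thread_lifecycle_events` is hypothetical. It enables the trace without a block on the assumption that the block form of `TracePoint#enable` only traces the calling thread on newer Rubies, and it rescues the exception that `Thread#join` re-raises.

 ```ruby
 # Hypothetical helper: runs the given block in a new thread and returns the
 # thread lifecycle events that TracePoint reported while it ran.
 def thread_lifecycle_events(&body)
   events = Queue.new # thread-safe; the hook runs inside the new thread

   trace = TracePoint.new(:thread_begin, :thread_end) { |tp| events << tp.event }
   trace.enable # no block, so the trace is not limited to the current thread

   begin
     thread = Thread.new do
       Thread.current.report_on_exception = false # keep the output quiet
       body.call
     end

     begin
       thread.join
     rescue Exception
       # Thread#join re-raises the exception that terminated the thread.
     end
   ensure
     trace.disable
   end

   observed = []
   observed << events.pop until events.empty?
   observed
 end

 observed = thread_lifecycle_events { raise Interrupt }
 missing = [:thread_begin, :thread_end] - observed

 if missing.empty?
   puts "both events emitted"
 else
   puts "missing events: #{missing.inspect}"
 end
 ```

 On an affected Ruby this should report `thread_end` as missing; a thread whose block returns normally reports both events.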

 ## Possible Fix 

 One option is to change the implementation of `thread_do_start` so that it has what amounts to an "ensure" block:

 ```c 
 static void 
 thread_do_start(rb_thread_t *th) 
 { 
     native_set_thread_name(th); 
     VALUE result = Qundef; 

     rb_execution_context_t *ec = th->ec; 
     int state; 

     EXEC_EVENT_HOOK(ec, RUBY_EVENT_THREAD_BEGIN, th->self, 0, 0, 0, Qundef); 

     EC_PUSH_TAG(ec); 
     if ((state = EC_EXEC_TAG()) == TAG_NONE) { 
         switch (th->invoke_type) { 
         case thread_invoke_type_proc: 
             result = thread_do_start_proc(th); 
             break; 

         case thread_invoke_type_ractor_proc: 
             result = thread_do_start_proc(th); 
             rb_ractor_atexit(th->ec, result); 
             break; 

         case thread_invoke_type_func: 
             result = (*th->invoke_arg.func.func)(th->invoke_arg.func.arg); 
             break; 

         case thread_invoke_type_none: 
             rb_bug("unreachable"); 
         } 
     } 

     EC_POP_TAG(); 
     VALUE errinfo = ec->errinfo; 

     if (!NIL_P(errinfo) && !RB_TYPE_P(errinfo, T_OBJECT)) { 
         ec->errinfo = Qnil; 
     } 

     rb_fiber_scheduler_set(Qnil); 
     EXEC_EVENT_HOOK(th->ec, RUBY_EVENT_THREAD_END, th->self, 0, 0, 0, Qundef); 

     ec->errinfo = errinfo; 

     if (state) 
         EC_JUMP_TAG(ec, state); 

     th->value = result; 
 } 
 ``` 

 It's possible that `rb_fiber_scheduler_set(Qnil);` can raise an exception itself. How do we write the code to handle that case?
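
 As a Ruby-level model of that question (not the C implementation), a nested `ensure` shows one ordering that would still deliver `thread_end` when the scheduler cleanup fails. The names `run_thread_body` and `scheduler_cleanup` are hypothetical; the lambda only stands in for `rb_fiber_scheduler_set(Qnil)`.

 ```ruby
 # Ruby-level model only. `run_thread_body` and `scheduler_cleanup` are
 # hypothetical names and are not part of the Ruby VM.
 def run_thread_body(events, scheduler_cleanup)
   events << :thread_begin
   yield
 ensure
   begin
     scheduler_cleanup.call # stands in for rb_fiber_scheduler_set(Qnil); may raise
   ensure
     events << :thread_end # still recorded if the cleanup raised
   end
 end

 events = []
 error = nil

 begin
   failing_cleanup = -> { raise "scheduler cleanup failed" }
   run_thread_body(events, failing_cleanup) { raise Interrupt }
 rescue Exception => e
   # Exception (not StandardError) so that Interrupt would also be caught.
   error = e
 end

 p events
 puts "propagated: #{error.class}"
 ```

 In this model the exception from the cleanup replaces the one raised by the thread body. Whether the C code should behave the same way, or should preserve the original exception instead, is a design decision this sketch does not settle.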
