Bug #17618

Exceptions in Fiber Scheduler cause a segv

Added by tenderlovemaking (Aaron Patterson) 5 months ago. Updated 9 days ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
ruby -v: ruby 3.1.0dev (2021-02-09T13:22:37Z master e7a831de8e) [x86_64-darwin20]
[ruby-core:102429]

Description

If the fiber scheduler doesn't define an unblock function, Ruby will segv when threads are joined.

Here is an example program:

class Scheduler
  def block blocker, timeout = nil
  end

  def fiber &block
    fiber = Fiber.new blocking: false, &block
    fiber.resume
    fiber
  end
end


Fiber.set_scheduler Scheduler.new

Fiber.schedule do
  Thread.new { }.join
end

The backtrace looks like this:

(lldb) bt
* thread #3, name = 'test.rb:17', stop reason = EXC_BAD_ACCESS (code=1, address=0xb0)
    frame #0: 0x00000001000dc49a miniruby`rb_ec_tag_jump(ec=0x0000000100a2ec50, st=RUBY_TAG_RAISE) at eval_intern.h:185:20
    frame #1: 0x00000001000dbda7 miniruby`rb_longjmp(ec=0x0000000100a2ec50, tag=6, mesg=0x000000010101b3f8, cause=0x0000000000000008) at eval.c:699:5
    frame #2: 0x00000001000dbb9c miniruby`rb_exc_raise(mesg=0x000000010101b3f8) at eval.c:717:5
    frame #3: 0x000000010037446c miniruby`raise_method_missing(ec=0x0000000100a2ec50, argc=3, argv=0x000070000e6d39e0, obj=0x000000010101b8d0, last_call_status=MISSING_MISSING) at vm_eval.c:955:2
    frame #4: 0x0000000100374288 miniruby`method_missing(ec=0x0000000100a2ec50, obj=0x000000010101b8d0, id=24721, argc=3, argv=0x000070000e6d39e0, call_status=MISSING_NOENTRY, kw_splat=0) at vm_eval.c:1002:5
    frame #5: 0x0000000100385fdd miniruby`rb_call0(ec=0x0000000100a2ec50, recv=0x000000010101b8d0, mid=24721, argc=2, argv=0x000070000e6d3be0, call_scope=CALL_FCALL, self=0x0000000000000008) at vm_eval.c:515:20
    frame #6: 0x0000000100358a02 miniruby`rb_funcallv_scope(recv=0x000000010101b8d0, mid=24721, argc=2, argv=0x000070000e6d3be0, scope=CALL_FCALL) at vm_eval.c:1021:16
    frame #7: 0x0000000100354c71 miniruby`rb_funcallv(recv=0x000000010101b8d0, mid=24721, argc=2, argv=0x000070000e6d3be0) at vm_eval.c:1038:12
    frame #8: 0x000000010035921d miniruby`rb_funcall(recv=0x000000010101b8d0, mid=24721, n=2) at vm_eval.c:1109:12
  * frame #9: 0x0000000100291d23 miniruby`rb_fiber_scheduler_unblock(scheduler=0x000000010101b8d0, blocker=0x000000010107bd70, fiber=0x000000010101b768) at scheduler.c:142:12
    frame #10: 0x00000001002f1445 miniruby`rb_threadptr_join_list_wakeup(thread=0x0000000100a2e9b0) at thread.c:555:13
    frame #11: 0x00000001002f0fd5 miniruby`thread_start_func_2(th=0x0000000100a2e9b0, stack_start=0x000070000e7d3f70) at thread.c:891:9
    frame #12: 0x00000001002f07b5 miniruby`thread_start_func_1(th_ptr=0x0000000100a2e9b0) at thread_pthread.c:1033:9
    frame #13: 0x00007fff2043a950 libsystem_pthread.dylib`_pthread_start + 224
    frame #14: 0x00007fff2043647b libsystem_pthread.dylib`thread_start + 15

It seems like the ec is missing a tag:

(lldb) f 0
frame #0: 0x00000001000dc49a miniruby`rb_ec_tag_jump(ec=0x0000000100a2ec50, st=RUBY_TAG_RAISE) at eval_intern.h:185:20
   182  static inline void
   183  rb_ec_tag_jump(const rb_execution_context_t *ec, enum ruby_tag_type st)
   184  {
-> 185      ec->tag->state = st;
   186      ruby_longjmp(ec->tag->buf, 1);
   187  }
   188  
(lldb) p ec->tag
(rb_vm_tag *const) $1 = 0x0000000000000000
(lldb) 

I tried popping the tag later in thread_start_func_2, but that caused the process to go into an infinite loop.

Updated by alanwu (Alan Wu) 5 months ago

Just some observations in case it's useful. Implementing unblock in the scheduler and printing out the current thread shows that unblock runs on a dead thread:

class Scheduler
  def block blocker, timeout = nil
  end

  def unblock a, b
    p Thread.current
  end

  def fiber &block
    fiber = Fiber.new blocking: false, &block
    fiber.resume
    fiber
  end
end


Fiber.set_scheduler Scheduler.new

Fiber.schedule do
  Thread.new { }.join
end

ruby 3.1.0dev (2021-02-09T22:47:36Z master 49d3830f44) [x86_64-darwin19]
#<Thread:0x00007fee4d81b490 test.rb:20 dead>

It doesn't seem right to run Ruby code on a dead thread.

Also, raising any exception in the unblock method will cause a SEGV. For example:

class Scheduler
  def block blocker, timeout = nil
  end

  def unblock a, b
    raise
  end

  def fiber &block
    fiber = Fiber.new blocking: false, &block
    fiber.resume
    fiber
  end
end


Fiber.set_scheduler Scheduler.new

Fiber.schedule do
  Thread.new { }.join
end
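
As a defensive measure on the scheduler side, an unblock hook could contain its own failures so that nothing propagates into the thread that is finishing. The following is a minimal sketch under that assumption, not the fix for the underlying crash: `GuardedUnblock`, `errors`, and `wake` are hypothetical names, and only the unblock hook is shown, so this is not a complete scheduler.

```ruby
# Hypothetical sketch: only the unblock hook is shown; not a complete scheduler.
class GuardedUnblock
  attr_reader :errors

  def initialize
    # Thread-safe container, since unblock may run on another thread.
    @errors = Thread::Queue.new
  end

  # Same two-argument shape as the unblock hook in the examples above.
  def unblock(blocker, fiber)
    wake(blocker, fiber)
  rescue StandardError => e
    # Record the failure instead of letting it unwind the calling thread.
    @errors << e
    nil
  end

  private

  # Stand-in for real wake-up logic; raises to mimic the failing example.
  def wake(_blocker, _fiber)
    raise "wake-up failed"
  end
end

guarded = GuardedUnblock.new
guarded.unblock(:blocker, :fiber) # does not raise; the error is queued
```

The recorded errors can then be inspected or re-raised later from a context that is able to handle them.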

Updated by ioquatix (Samuel Williams) 4 months ago

My initial reaction is that a scheduler without unblock is broken by design. It's the dead thread that invokes unblock as part of its tidy-up, which in other cases wakes up other threads. I don't have any strong opinion about it, except that a thread that transitions to dead is then able to notify others that join can proceed.
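
Because unblock can be invoked from the finishing thread, one way to keep the work done there minimal is to hand the fiber off through a thread-safe queue and resume it later on the scheduler's own thread. This is a sketch under that assumption: `HandOffScheduler` and `drain_ready` are hypothetical names, and the class shows only the hand-off, not a full scheduler.

```ruby
# Hypothetical sketch: shows only the hand-off in unblock, not a full scheduler.
class HandOffScheduler
  def initialize
    @ready = Thread::Queue.new # safe to push to from any thread
  end

  # May be called from the thread that is finishing; does no other work there.
  def unblock(_blocker, fiber)
    @ready << fiber
  end

  # Meant to be called on the scheduler's own thread. Returns the fibers
  # unblocked since the last call, without blocking.
  def drain_ready
    fibers = []
    fibers << @ready.pop(true) until @ready.empty?
    fibers
  end
end

scheduler = HandOffScheduler.new
Thread.new { scheduler.unblock(:blocker, :waiting_fiber) }.join
scheduler.drain_ready # returns [:waiting_fiber]
```

Symbols stand in for real fibers here so the hand-off can be exercised without registering a scheduler.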

Updated by ioquatix (Samuel Williams) about 1 month ago

I found the reason for this and I have made a PR which I think addresses this. I'll use this as a test case.

https://github.com/ruby/ruby/pull/4471

Updated by ioquatix (Samuel Williams) about 1 month ago

Okay, now rather than a SEGV, I get an unlimited number of

undefined method `unblock' for #<Scheduler:0x000000010a1b1fb0> (NoMethodError)

which I think is at least somewhat better. So I'll merge the PR.
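
Since a missing unblock now shows up as a NoMethodError at join time, a scheduler author could check for the hooks up front before registering the object. The following is a minimal sketch: `missing_scheduler_hooks` and `EXPECTED_HOOKS` are hypothetical, and the hook list is simply the three method names used in this thread, not an authoritative list of what a scheduler has to implement.

```ruby
# Hook names taken from the schedulers in this thread; an assumption,
# not an official list.
EXPECTED_HOOKS = %i[block unblock fiber].freeze

# Hypothetical helper: returns the hooks the object does not respond to.
def missing_scheduler_hooks(scheduler, hooks = EXPECTED_HOOKS)
  hooks.reject { |hook| scheduler.respond_to?(hook) }
end

# Mirrors the scheduler from the original report, which lacks unblock.
class IncompleteScheduler
  def block(blocker, timeout = nil)
  end

  def fiber(&block)
    fiber = Fiber.new(blocking: false, &block)
    fiber.resume
    fiber
  end
end

missing = missing_scheduler_hooks(IncompleteScheduler.new)
warn "scheduler is missing: #{missing.join(', ')}" unless missing.empty?
```

Running the check before calling Fiber.set_scheduler turns the problem into a single early warning rather than a failure during thread tidy-up.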


Updated by jeremyevans0 (Jeremy Evans) 9 days ago

  • Status changed from Open to Closed