Bug #18036


Pthread fibers become invalid on fork - different from normal fibers.

Added by ioquatix (Samuel Williams) almost 3 years ago. Updated 8 months ago.

Target version:


Fork is notoriously hard to use correctly, and in most cases we should be encouraging Process.spawn. However, it does have valid use cases, for example the pre-fork model of server design.

We recently introduced a non-native fiber implementation based on pthreads, which is generally more compatible than the copy coroutine w.r.t. the overall burden on the implementation. However, it has one weak point: pthreads become invalid on fork, and thus fibers become invalid on fork. That means that the following program can become invalid:

Fiber.new do
  fork
end.resume

It will create two threads, the main thread and the thread for the fiber. When the child begins execution, it will be within the child pthread, but the parent pthread is no longer valid, i.e. it's gone.

I see a couple of options here (not mutually exclusive):

  • Combining Fibers and fork is invalid. Fork only works from main fiber.
  • Ignore the problem and expect users of fork to be aware that the program can potentially enter an invalid state - okay for fork-exec but not much else.
  • Terminate all non-current fibers as we do for threads, and possibly fail if the current fiber exits for some reason.

Because pthread coroutine should be very uncommon, I don't think we should sacrifice the general good qualities of the fiber semantic model for some obscure case. Maybe it would be sufficient to have a warning (not printed by default unless running on pthread coroutines), that fork within a non-main fiber can have undefined results.
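As a rough illustration of the first option (fork only works from the main fiber), here is a minimal sketch of a user-level guard. The `ForkGuard` module and its method names are hypothetical and are not part of Ruby. It assumes the file is loaded from the main fiber of the main thread, and it does not try to detect which coroutine implementation is in use.

```ruby
require 'fiber' # provides Fiber.current before Ruby 3.1

# Hypothetical guard, not a Ruby API: remembers the fiber that loaded this
# file and refuses to fork from any other fiber.
module ForkGuard
  # Assumption: this file is loaded from the main fiber of the main thread.
  @main_fiber = Fiber.current

  def self.main_fiber?
    Fiber.current.equal?(@main_fiber)
  end

  def self.fork(&block)
    unless main_fiber?
      raise FiberError, "fork called from a non-main fiber"
    end
    Process.fork(&block)
  end
end
```

Calling `ForkGuard.fork` at the top level behaves like `Process.fork`, while calling it inside `Fiber.new { ... }.resume` raises instead of forking. A warning-only variant could replace the `raise` with `warn`.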

Updated by jeremyevans0 (Jeremy Evans) almost 3 years ago

I'm fine with having fork raise an exception when called from a non-main fiber if the pthread coroutine implementation is used, and issue a verbose warning in other cases.

Updated by matz (Yukihiro Matsumoto) over 2 years ago

Using fork from within non-main fibers should be prohibited only on architectures where it is unreliable (e.g. OpenBSD).


Updated by ioquatix (Samuel Williams) over 2 years ago

Thanks @matz (Yukihiro Matsumoto). @ko1 (Koichi Sasada), on my PC, the following program segfaults. I tried it on 2.7 and 3 (head).

require 'fiber'

fiber = Fiber.new do
  while true
    puts "Hello World"
    fork # the child continues inside this non-main fiber
    Fiber.yield
  end
end

fiber.resume
puts "Exiting"

I'm not sure: do we expect this to work? The child process seems to crash, but the program seems well-formed.

Hello World
[BUG] Segmentation fault at 0x0000000000000040
ruby 2.7.3p183 (2021-04-05 revision 6847ee089d) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0001 p:---- s:0003 e:000002 (none) [FINISH]

-- Machine register context ------------------------------------------------
 RIP: 0x0000560912d70169 RBP: 0x00005609149f05d0 RSP: 0x00007f972fa09d60
 RAX: 0x0000000000000000 RBX: 0x000056091484edb0 RCX: 0x00005609149f05d0
 RDX: 0x0000000000000010 RDI: 0x00005609149f05d0 RSI: 0x0000000000000006
  R8: 0x0000000000000000  R9: 0x0000000000000000 R10: 0x00005609149d7630
 R11: 0x0000000000000000 R12: 0x0000000000000006 R13: 0x000056091484ee50
 R14: 0x0000000000000000 R15: 0x00000000ffffffff EFL: 0x0000000000010202

-- C level backtrace information -------------------------------------------
ruby(0x2143a8) [0x560912f573a8]
/home/samuel/.rubies/ruby-2.7.3/bin/ruby(rb_bug_for_fatal_signal+0xe4) [0x560912ffd974]
/home/samuel/.rubies/ruby-2.7.3/bin/ruby(sigsegv+0x49) [0x560912eb52f9]
/home/samuel/.rubies/ruby-2.7.3/bin/ruby(rb_ec_tag_jump+0x9) [0x560912d70169]
/home/samuel/.rubies/ruby-2.7.3/bin/ruby(rb_longjmp+0x82) [0x560912d71fd2]
/home/samuel/.rubies/ruby-2.7.3/bin/ruby(rb_print_undef+0x40) [0x560912d74790]
/home/samuel/.rubies/ruby-2.7.3/bin/ruby(rb_vraise+0x4c) [0x560912fff01c]
/home/samuel/.rubies/ruby-2.7.3/bin/ruby(rb_raise+0x94) [0x560912fff0b4]
/home/samuel/.rubies/ruby-2.7.3/bin/ruby(return_fiber.part.0+0x19) [0x560912fce3d9]
/home/samuel/.rubies/ruby-2.7.3/bin/ruby(rb_fiber_start+0x2f1) [0x560912fd11e1]
/home/samuel/.rubies/ruby-2.7.3/bin/ruby(fiber_entry+0x9) [0x560912fd12c9]

Updated by jeremyevans0 (Jeremy Evans) 8 months ago

  • Status changed from Open to Closed

I tested the example in my environment, and since Ruby 3.0, it doesn't fail (it does segfault on Ruby 2.7). I tested with Ruby master and the behavior is the same with both the amd64 coroutine and pthread coroutine, which is that Fiber.yield raises a FiberError in the forked process ("attempt to yield on a not resumed fiber").

Since the behavior doesn't appear to be unreliable, I'm going to close this.
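To observe this behavior without relying on the child's stderr, a small sketch along these lines may help. The helper name `yield_result_in_forked_fiber` is hypothetical. It forks from inside a non-main fiber, calls Fiber.yield in the child, and sends whatever happens back to the parent over a pipe. The result depends on the Ruby version and platform, so treat it as a probe rather than a guarantee.

```ruby
# Hypothetical probe: forks inside a non-main fiber and reports what
# Fiber.yield does in the child process.
def yield_result_in_forked_fiber
  reader, writer = IO.pipe

  fiber = Fiber.new do
    pid = fork do
      reader.close
      message =
        begin
          Fiber.yield
          "Fiber.yield returned"
        rescue FiberError => error
          "FiberError: #{error.message}"
        end
      writer.write(message)
      writer.close
      exit!(0) # skip at_exit handlers inherited from the parent
    end
    writer.close # parent keeps only the read end
    Process.wait(pid)
  end

  fiber.resume
  reader.read # empty string if the child crashed before writing
ensure
  reader&.close
end

puts yield_result_in_forked_fiber
```

On the Ruby versions described above, the printed line would be expected to start with "FiberError", while an empty line would suggest the child crashed before reporting.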

