Bug #11922

closed

[PATCH] fix ASYNC BUG race from bootstraptest/test_fork.rb

Added by normalperson (Eric Wong) over 8 years ago. Updated about 8 years ago.

Status: Closed
Assignee: -
Target version: -
[ruby-core:72590]

Description

thread_pthread.c (rb_thread_create_timer_thread): fix race

This fixes an occasional [ASYNC BUG] failure in
bootstraptest/test_fork.rb '[ruby-dev:37934]'
which tests fork/pthread_create failure by setting
RLIMIT_NPROC to 1 and triggering EAGAIN on pthread_create
when attempting to recreate the timer thread.

The problem timeline is as follows:

thread 1                           thread 2
---------------------------------------------------------------
rb_thread_create_timer_thread
setup_communication_pipe
                                   rb_thread_wakeup_timer_thread_low
pthread_create fails               pipe looks valid, write!
CLOSE_INVALIDATE (x4)              EBADF -> ASYNC BUG

The checks in rb_thread_wakeup_timer_thread_low only tried to
guarantee proper ordering with native_stop_timer_thread, not
rb_thread_create_timer_thread :x

Now, this should allow rb_thread_create_timer_thread to
synchronize properly with rb_thread_wakeup_timer_thread_low by
delaying the validation marking of the timer_thread_pipe until
we are certain the timer thread is alive.

In this version, rb_thread_wakeup_timer_thread_low becomes a
noop.  Threading is still completely broken with NPROC==1, but
there's not much we can do about it besides warning the user.
We no longer spew a scary [ASYNC BUG] message at them and
dump core on them.
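
A rough sketch of the ordering described above, not the actual thread_pthread.c code: the pipe is only marked valid (owner set) after thread creation succeeds, so a concurrent wakeup sees "no owner" and does nothing instead of writing to a closed fd. Every name here (struct timer_pipe, spawn_fn, spawn_eagain, create_timer_thread, wakeup_timer_thread) is made up for illustration, and spawn_eagain only simulates the EAGAIN that RLIMIT_NPROC would cause. The use of C11 atomics for the owner field is also an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <unistd.h>

/* Hypothetical stand-in for Ruby's timer_thread_pipe; not the real struct. */
struct timer_pipe {
    int fds[2];          /* [0] read end, [1] write end; -1 when closed */
    atomic_long owner;   /* pid of the owner, 0 while the pipe is not valid */
    pthread_t thread;    /* only meaningful once owner is set */
};

/* pthread_create without the attr argument, so a failing variant can be
 * swapped in to simulate EAGAIN under RLIMIT_NPROC. */
typedef int (*spawn_fn)(pthread_t *, void *(*)(void *), void *);

void timer_pipe_init(struct timer_pipe *tp)
{
    tp->fds[0] = tp->fds[1] = -1;
    atomic_init(&tp->owner, 0);
}

/* Toy timer thread: block until one wakeup byte arrives, then exit. */
void *timer_main(void *arg)
{
    struct timer_pipe *tp = arg;
    char c;
    while (read(tp->fds[0], &c, 1) < 0 && errno == EINTR)
        ;
    return NULL;
}

int spawn_ok(pthread_t *th, void *(*fn)(void *), void *arg)
{
    return pthread_create(th, NULL, fn, arg);
}

/* Simulated failure: behaves like pthread_create hitting RLIMIT_NPROC. */
int spawn_eagain(pthread_t *th, void *(*fn)(void *), void *arg)
{
    (void)th; (void)fn; (void)arg;
    return EAGAIN;
}

int create_timer_thread(struct timer_pipe *tp, spawn_fn spawn)
{
    int err;

    assert(atomic_load(&tp->owner) == 0);   /* must not already be valid */
    if (pipe(tp->fds) != 0)
        return errno;
    err = spawn(&tp->thread, timer_main, tp);
    if (err != 0) {
        /* Owner was never set, so no waker can be using these fds. */
        close(tp->fds[0]);
        close(tp->fds[1]);
        tp->fds[0] = tp->fds[1] = -1;
        return err;
    }
    /* Mark the pipe valid only now that the thread is known to exist. */
    atomic_store(&tp->owner, (long)getpid());
    return 0;
}

/* Returns 1 if a byte was written, 0 if skipped (noop), -1 on write error. */
int wakeup_timer_thread(struct timer_pipe *tp)
{
    char c = '!';
    if (atomic_load(&tp->owner) != (long)getpid())
        return 0;
    return write(tp->fds[1], &c, 1) == 1 ? 1 : -1;
}
```

With spawn_eagain, create_timer_thread returns EAGAIN and a later wakeup_timer_thread returns 0 without touching the closed fds; with spawn_ok, the wakeup writes one byte and the toy timer thread exits.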

Note: I am testing this overnight with the [ruby-dev:37934] bit extracted
from bootstraptest/test_fork.rb:

	  main = Thread.current
	  Thread.new { sleep 0.01 until main.stop?; Thread.kill main }
	  Process.setrlimit(:NPROC, 1)
	  fork {}

This bug seems easier to reproduce on my weak VM with 32-bit luserspace
(64-bit kernel) than on more powerful machines.  Even without this
patch, it could take hours to reproduce the race.  I haven't been able
to reproduce this bug at all on my Phenom II machine.

Way too tired to be committing this right now...

Actions #1

Updated by Anonymous over 8 years ago

  • Status changed from Open to Closed

Applied in changeset r53373.


thread_pthread.c (rb_thread_create_timer_thread): fix race

This fixes an occasional [ASYNC BUG] failure in
bootstraptest/test_fork.rb '[ruby-dev:37934]'
which tests fork/pthread_create failure by setting
RLIMIT_NPROC to 1 and triggering EAGAIN on pthread_create
when attempting to recreate the timer thread.

The problem timeline is as follows:

thread 1                           thread 2
---------------------------------------------------------------
rb_thread_create_timer_thread
setup_communication_pipe
                                   rb_thread_wakeup_timer_thread_low
pthread_create fails               pipe looks valid, write!
CLOSE_INVALIDATE (x4)              EBADF -> ASYNC BUG

The checks in rb_thread_wakeup_timer_thread_low only tried to
guarantee proper ordering with native_stop_timer_thread, not
rb_thread_create_timer_thread :x

Now, this should allow rb_thread_create_timer_thread to
synchronize properly with rb_thread_wakeup_timer_thread_low by
delaying the validation marking of the timer_thread_pipe until
we are certain the timer thread is alive.

In this version, rb_thread_wakeup_timer_thread_low becomes a
noop. Threading is still completely broken with NPROC==1, but
there's not much we can do about it besides warning the user.
We no longer spew a scary [ASYNC BUG] message or dump core
on them.

  • thread_pthread.c (setup_communication_pipe): delay setting owner
    (rb_thread_create_timer_thread): until thread creation succeeds
    [ruby-core:72590] [Bug #11922]

Updated by naruse (Yui NARUSE) about 8 years ago

  • Backport changed from 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN, 2.3: REQUIRED to 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN, 2.3: DONE

ruby_2_3 r54426 merged revision(s) 53373.
