Bug #21959: rb_internal_thread_event_hooks_rw_lock is not reinitialized after fork causing deadlocks

Added by anmarchenko_datadog (Andrey Marchenko) 7 days ago. Updated 3 days ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:125078]

Description

Ruby's GVL Instrumentation API uses a read-write lock (rb_internal_thread_event_hooks_rw_lock) to protect the list of thread event hooks:

  • Read lock — acquired on every GVL transition to iterate and call hook callbacks (rb_thread_execute_hooks)
  • Write lock — acquired when adding/removing hooks (rb_internal_thread_add_event_hook, rb_internal_thread_remove_event_hook)

After fork(), Ruby reinitializes several internal locks (e.g. vm->ractor.sched.lock, timer_th.waiting_lock), but not rb_internal_thread_event_hooks_rw_lock. The reinitialization for this lock was not added when the GVL Instrumentation API was introduced.

The full reproducer is available here: https://github.com/anmarchenko/ruby-locks-fork-bug

Deadlock sequence

  1. Parent process has thread event hooks registered (e.g. by a profiler like dd-trace-rb)
  2. Multiple threads run concurrently, causing GVL transitions — each transition acquires a read lock on the rwlock
  3. fork() happens while a thread holds the read lock
  4. In the child, only the forking thread survives — the thread that held the lock is gone, but the lock state is copied as-is
  5. Child tries to add or remove a hook → needs write lock → blocks forever on a lock that will never be released
  6. Deadlock

Impact

This affects any Ruby C extension using the GVL Instrumentation API in combination with fork-based servers (Resque, Unicorn, Passenger, etc.). The original report comes from dd-trace-rb's profiler deadlocking Resque workers on Alpine Linux (musl libc): https://github.com/DataDog/dd-trace-rb/issues/4967

Updated by anmarchenko_datadog (Andrey Marchenko) 7 days ago Actions #1 [ruby-core:125079]

I am planning to open a PR with a proposed fix soon.

Updated by byroot (Jean Boussier) 6 days ago Actions #3

  • Backport changed from 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN to 3.2: WONTFIX, 3.3: REQUIRED, 3.4: REQUIRED, 4.0: REQUIRED

Updated by anmarchenko_datadog (Andrey Marchenko) 6 days ago Actions #4

  • Status changed from Open to Closed

Applied in changeset git|c8155822c460a5734d700cd468d306ca03b44ce4.


reinit rb_internal_thread_event_hooks_rw_lock at fork

[Bug #21959]

Updated by hsbt (Hiroshi SHIBATA) 3 days ago · Edited Actions #5 [ruby-core:125105]

  • Backport changed from 3.2: WONTFIX, 3.3: REQUIRED, 3.4: REQUIRED, 4.0: REQUIRED to 3.2: WONTFIX, 3.3: DONE, 3.4: REQUIRED, 4.0: REQUIRED