Project

General

Profile

Actions

Bug #21926

closed

Thread#value on popen3 wait thread hangs in finalizer

Bug #21926: Thread#value on popen3 wait thread hangs in finalizer

Added by stevecrozz (Stephen Crosby) 21 days ago. Updated 3 days ago.

Status:
Closed
Target version:
-
[ruby-core:124887]

Description

Calling Thread#value on an Open3.popen3 wait thread from a finalizer completes in Ruby 3.2 but hangs in Ruby 3.3+. See repro.rb below. When the Ruby process hangs in these conditions, it no longer responds to signals and it seems to be unable to run any other threads.

This affects the schmooze gem (and potentially other code using Open3.popen3 with finalizers), causing test suites to hang intermittently.

# repro.rb
require 'open3'

class ProcessWrapper
  def initialize
    @stdin, @stdout, @stderr, @wait_thread = Open3.popen3("cat")
    ObjectSpace.define_finalizer(self, self.class.make_finalizer(@stdin, @stdout, @stderr, @wait_thread))
  end
                                                       
  def self.make_finalizer(stdin, stdout, stderr, wait_thread)
    proc do
      stdin.close rescue nil
      stdout.close rescue nil
      stderr.close rescue nil
      wait_thread.value  # Hangs here in Ruby 3.3+
    end
  end
end
                  
100.times { ProcessWrapper.new } 
GC.stress = true
1000.times { Object.new } 
puts "done"

Environment

  • Linux x86_64
  • Tested on Ruby 3.2.7, 3.3.7, 3.4.8

Updated by luke-gru (Luke Gruber) 14 days ago ยท Edited Actions #1 [ruby-core:124936]

  • Backport changed from 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN to 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: REQUIRED, 4.0: REQUIRED

Thank you for the nice reproduction! We have a fix coming, and it will probably be backported to 3.3, 3.4 and 4.0.

The fix is at https://github.com/ruby/ruby/pull/16307.

Updated by luke-gru (Luke Gruber) 14 days ago Actions #2

  • Assignee set to luke-gru (Luke Gruber)

Updated by luke-gru (Luke Gruber) 10 days ago Actions #3

  • Backport changed from 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: REQUIRED, 4.0: REQUIRED to 3.2: UNKNOWN, 3.3: REQUIRED, 3.4: REQUIRED, 4.0: REQUIRED

Updated by Anonymous 10 days ago Actions #4

  • Status changed from Open to Closed

Applied in changeset git|08372635f7ec09f7115bd254246ebd637499651c.


Fix race condition right after ubf registration

Registering a ubf was considered problematic in some cases because it could
result in lock ordering inversions with the ubf function itself. I believe
this is the reason that in patch be1bbd5b7d, the ubf was registered outside of
the thread_sched_lock.

For example, thread_sched_to_waiting_until_wakeup() (native_sleep)
should register the ubf with the thread_sched_lock (TSL) taken. The ordering
is (TSL -> UBF lock). During the ubf call itself, since the UBF lock is
always acquired during the call, the ordering is (UBF lock -> TSL). This
inversion was avoided by registering the ubf outside the TSL.

This inversion is benign in this case, though. Since there can be no ubf on the
thread before the call to ubf_set, and this ubf is the only place where the
(UBF lock -> TSL lock) ordering is done, it is okay to register it with the TSL.
In fact registering the ubf outside the TSL can result in problamatic race conditions,
so we now register the ubf with the TSL lock.

This fixes a race condition whereby a newly registered ubf is called before the thing it's
protecting (the sleep) is actually run. The sleeping thread can never be interrupted if this
happens.

Fixes [Bug #21926]

Updated by luke-gru (Luke Gruber) 9 days ago Actions #5 [ruby-core:124974]

A note to whoever backports this (which could be me, I'm just waiting ~ 1 week and checking various CIs after the merge):

https://github.com/ruby/ruby/pull/16362 should be backported as well, they are really 1 fix in 2 separate commits.

Updated by k0kubun (Takashi Kokubun) 3 days ago Actions #6 [ruby-core:125019]

  • Backport changed from 3.2: UNKNOWN, 3.3: REQUIRED, 3.4: REQUIRED, 4.0: REQUIRED to 3.2: UNKNOWN, 3.3: REQUIRED, 3.4: REQUIRED, 4.0: DONE
Actions

Also available in: PDF Atom