Bug #21504

closed

[Ractor] Process.waitpid blocks ractor, new NT doesn't pick up other ractors

Added by luke-gru (Luke Gruber) 8 months ago. Updated 17 days ago.

Status:
Closed
Assignee:
Target version:
-
[ruby-core:122683]

Description

The following code hangs when run with RUBY_MAX_CPU=2 make run.

Note: RUBY_MAX_CPU is set to 2 so that only one non-main ractor can run at a time.

test.rb:

rs = []
2.times do |i|
  rs << Ractor.new(i) do |i|
    if i == 0
      io = IO.popen("ruby -e 'sleep'")
      Process.wait(io.pid) # blocks forever: the child never exits
    else
      sleep 1 # make sure first ractor blocks forever first
      $stderr.puts "Running r #{i}"
      100_000.times do
        [nil] * 1_000
      end
      $stderr.puts "done r #{i}"
    end
  end
end

while rs.size == 2
  r, obj = Ractor.select(*rs)
  rs.delete(r)
end

Expected behavior: the timer thread should create a new native thread (NT) to compensate for the one occupied by the dedicated task, and the new NT should pick up the other runnable ractor.

In contrast, the following works fine:

rs = []
2.times do |i|
  rs << Ractor.new(i) do |i|
    if i == 0
      r, w = IO.pipe
      r.read(1) # blocks forever: nothing is ever written to w
    else
      sleep 1 # make sure first ractor blocks forever first
      $stderr.puts "Running r #{i}"
      100_000.times do
        [nil] * 1_000
      end
      $stderr.puts "done r #{i}"
    end
  end
end

while rs.size == 2
  r, obj = Ractor.select(*rs)
  rs.delete(r)
end

Updated by jhawthorn (John Hawthorn) 8 months ago Actions #1 [ruby-core:122684]

  • Assignee set to ractor

Updated by luke-gru (Luke Gruber) 7 months ago Actions #2 [ruby-core:122821]

The difference is that IO operations (e.g. IO#read) register wait events with the timer thread, and registering wakes the timer thread up. Once awake, the timer thread checks whether a new NT needs to be created, and creates one if necessary. Process.waitpid does not wake the timer thread, so no replacement NT is created. A quick fix would be to wake the timer thread when Process.waitpid is called, but other methods could be affected as well.
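Until this is fixed in the scheduler, one possible user-level stopgap is to poll for the child instead of parking in a single blocking wait. The sketch below is only an illustration: wait_without_blocking_forever is a hypothetical helper, not part of Ruby, and it assumes that sleep gives the scheduler a chance to run other ractors between polls (the second ractor in the report already relies on sleep working). It has not been checked against the hang described here.

```ruby
require "rbconfig"

# Hypothetical helper (not part of Ruby): waits for a child process by polling
# with WNOHANG rather than issuing one blocking waitpid call.
def wait_without_blocking_forever(pid, interval: 0.05)
  loop do
    # With WNOHANG, Process.wait2 returns nil while the child is still running,
    # and [pid, Process::Status] once it has exited.
    reaped = Process.wait2(pid, Process::WNOHANG)
    return reaped[1].exitstatus if reaped
    sleep interval # assumption: lets other ractors run between polls
  end
end

# Spawn a short-lived child that exits with status 3, then wait for it.
pid = Process.spawn(RbConfig.ruby, "-e", "sleep 0.2; exit 3")
status = wait_without_blocking_forever(pid)
$stderr.puts "child exited with status #{status}"
```

The trade-off is latency: the child's exit is noticed up to one interval late, in exchange for never holding the NT in an unbounded blocking call.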

Updated by luke-gru (Luke Gruber) 7 months ago Actions #3 [ruby-core:122832]

This is actually a more pervasive problem than I first realized, because IO#read only sometimes registers with the timer thread. A plain IO#read without arguments does not wake the timer thread, so it can block the NT forever as well.

Updated by Anonymous 17 days ago Actions #4

  • Status changed from Open to Closed

Applied in changeset git|7b3370a5579956404d742a2e104d72e7c89480e4.


Wake timer to create new SNT when needed for dedicated task (#16009)

When removing a thread from running_threads, if we're on a shared
native thread and we're running a dedicated task, we need to wake
the timer thread so it can create a new SNT if necessary. We only
do this if it's waiting forever without the 10ms quantum timeout
for now, because max 10ms of wait is considered "good enough".
In the future, perhaps we can force the timer thread to wake if this
becomes an issue (timer_thread_wakeup_force).

Fixes [Bug #21504]
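
The condition described in the commit message can be modeled roughly as follows. This is a toy Ruby model for illustration only: the actual change is in Ruby's C thread scheduler, and TimerState, on_thread_removed_from_running, and the keyword names are all hypothetical and do not mirror the real identifiers.

```ruby
# Toy model only; names are hypothetical and do not mirror the C implementation.
# waiting_forever: true when the timer thread sleeps with no timeout,
#                  false when it sleeps with the 10ms quantum timeout.
# wakeups:         how many times the model has woken the timer thread.
TimerState = Struct.new(:waiting_forever, :wakeups) do
  def wakeup
    self.wakeups += 1
  end
end

# Called when a thread is removed from running_threads. Returns true if the
# timer thread was woken so that it can create a new shared native thread (SNT).
def on_thread_removed_from_running(timer, on_shared_nt:, dedicated_task:)
  # Only a dedicated task on a shared native thread needs compensation.
  return false unless on_shared_nt && dedicated_task
  # With the 10ms quantum timeout the timer wakes soon enough on its own.
  return false unless timer.waiting_forever

  timer.wakeup
  true
end

timer = TimerState.new(true, 0)
woken = on_thread_removed_from_running(timer, on_shared_nt: true, dedicated_task: true)
$stderr.puts "timer woken: #{woken}, wakeups: #{timer.wakeups}"
```

In this model the timer is left alone when it is already on the 10ms quantum timeout, which matches the "good enough" reasoning in the commit message.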
