Bug #21840: Locking a mutex can lead to starvation - Ruby - Ruby Issue Tracking System

Actions

Copy link

Bug #21840

open

Locking a mutex can lead to starvation

Bug #21840: Locking a mutex can lead to starvation

Added by luke-gru (Luke Gruber) about 2 months ago. Updated about 2 months ago.

Status:

Open

Assignee:

luke-gru (Luke Gruber)

Target version:

4.1

ruby -v:

4.1.0dev

Backport:

3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: REQUIRED, 4.0: REQUIRED

[ruby-core:124563]

Description

Continually locking a mutex m can lead to starvation if all other threads are on the waitq of m.

Let T be the thread that keeps on acquiring mutex m in a loop.

Iteration 1:

T locks mutex m
All other threads attempt to acquire m and end up on its waitq
T releases the GVL, doesn't wake any other threads because they're all waiting on m
T returns from the blocking function, resets its running_time to 0, acquires the GVL and continues running
T unlocks m. It adds the head of the waitq (T2) to the thread readyq. T continues running

Iteration 2:

T locks mutex m
T calls a blocking function. This time, it dequeues the readyq (ex: T2) and sends it the wakeup signal. T runs its blocking function
T2 wakes up, acquires the GVL and attempts to lock mutex m. It fails and goes back asleep, putting itself back on the waitq of m.
T returns from the blocking function, sets its running_time back to 0, acquires the GVL and keeps running
T unlocks m. It adds the head of the waitq (ex: T3) to the end of the thread readyq. T continues running. Repeat Iteration 2 except the head of readyq is now T3

The problem is that T can never be pre-empted.

Example script:

m = Mutex.new

def fib(n)
  return n if n <= 1
  fib(n - 1) + fib(n - 2)
end

t1_running = false
t1 = Thread.new do
  t1_running = true
  loop do
    fib(20)
    m.synchronize do
      $stderr.puts "t1 iter"
    end
  end
end

loop until t1_running

5.times.map do
  Thread.new do
    m.synchronize do
    end
  end
end.each(&:join)

Updated by luke-gru (Luke Gruber) about 2 months ago Actions
Copy link
#1

Target version changed from 4.0 to 4.1
ruby -v set to 4.1.0dev
Backport changed from 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN to 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: REQUIRED, 4.0: REQUIRED

Updated by Anonymous about 2 months ago Actions
Copy link
#2

Status changed from Open to Closed

Applied in changeset git|994257ab06072df38de024e70a60aa9a87e36089.

Prevent starvation when acquiring mutex over and over (#15877)

Continually locking a mutex m can lead to starvation if all other threads are on the waitq of m.

See https://bugs.ruby-lang.org/issues/21840 for more details.

Solution:

When a thread T1 wakes up T2 during mutex unlock but T1 or any other thread successfully acquires it
before T2, then we record the running_time of the thread during mutex acquisition. Then during unlock, if
that thread's running_time is less than the saved running time, we set it back to the saved time.

Fixes [Bug #21840]

Updated by ivoanjo (Ivo Anjo) about 2 months ago Actions
Copy link
#3 [ruby-core:124639]

Really cool! I wonder if this will end up solving https://bugs.ruby-lang.org/issues/19717 as well / https://github.com/ruby/ruby/pull/8331 is an old PR for it.

Updated by luke-gru (Luke Gruber) about 2 months ago · Edited Actions
Copy link
#4 [ruby-core:124640]

Status changed from Closed to Open

I wasn't aware of those old issues. I'll take a look, thanks!

I reverted the commit because of issues with a Monitor test in CI. I thought it was related to this change, but the test kept failing even after the revert. It turns out it was a badly designed test and I've fixed it. I plan on recommitting this fix, but I may change the implementation.

Actions

Copy link

Also available in: PDF Atom

Project

General

Profile

Ruby

Custom queries

Bug #21840

Locking a mutex can lead to starvation

Updated by luke-gru (Luke Gruber) about 2 months ago Actions
Copy link
#1

Updated by Anonymous about 2 months ago Actions
Copy link
#2

Updated by ivoanjo (Ivo Anjo) about 2 months ago Actions
Copy link
#3 [ruby-core:124639]

Updated by luke-gru (Luke Gruber) about 2 months ago · Edited Actions
Copy link
#4 [ruby-core:124640]

Project

General

Profile

Ruby

Custom queries

Bug #21840

Locking a mutex can lead to starvation

Updated by luke-gru (Luke Gruber) about 2 months ago ActionsCopy link #1

Updated by Anonymous about 2 months ago ActionsCopy link #2

Updated by ivoanjo (Ivo Anjo) about 2 months ago ActionsCopy link #3 [ruby-core:124639]

Updated by luke-gru (Luke Gruber) about 2 months ago · Edited ActionsCopy link #4 [ruby-core:124640]

Updated by luke-gru (Luke Gruber) about 2 months ago Actions
Copy link
#1

Updated by Anonymous about 2 months ago Actions
Copy link
#2

Updated by ivoanjo (Ivo Anjo) about 2 months ago Actions
Copy link
#3 [ruby-core:124639]

Updated by luke-gru (Luke Gruber) about 2 months ago · Edited Actions
Copy link
#4 [ruby-core:124640]