Bug #22120 » ar_find_entry_hint_race.rb

Eregon (Benoit Daloze), 06/18/2026 09:28 AM

 
# Reproducer for race condition in ar_find_entry_hint() (hash.c)
#
# The bug: RHASH_AR_TABLE_BOUND and the hints pointer are cached once,
# then ar_equal() calls #eql? which executes Ruby code and can yield
# the GVL. Another thread can then mutate the hash.
#
# The AR table and the ST table occupy the same storage inside the hash
# object (at offset sizeof(struct RHash)). AR->ST conversion overwrites the
# hint bytes and the pairs array with st_table fields (pointers, counters),
# which a concurrent ar_find_entry_hint reader then interprets as VALUEs.
#
# Strategy: use a flag + sleep to force the interleaving.
# 1. Reader enters ar_find_entry_hint, iterates entries, calls #eql?
# 2. Inside #eql?, reader sets a flag and sleeps (releases GVL)
# 3. Writer sees the flag, mutates the hash (shift + mass insert -> AR->ST)
# 4. Writer finishes while the reader is still asleep (there is no explicit
#    done flag; the sleep is assumed to be long enough)
# 5. Reader wakes up, continues the ar_find_entry_hint loop on corrupted data
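
=begin
The handshake in the steps above can be tried in isolation, without touching
Hash internals. The following is a minimal sketch, not part of the reproducer:
the local names (events, in_eql, done) are hypothetical, and it uses an
explicit done flag so the ordering is deterministic, whereas the script itself
relies on a fixed sleep.

```ruby
# Standalone sketch of the flag handshake (no Hash internals involved).
events = Queue.new   # thread-safe log of who ran when
in_eql = false       # set by the reader once it is "inside eql?"
done   = false       # hypothetical done flag, set by the writer

writer = Thread.new do
  Thread.pass until in_eql     # wait for the reader to reach its critical point
  events << :writer_mutated    # stand-in for the shift + mass insert
  done = true
end

reader = Thread.new do
  events << :reader_entered
  in_eql = true
  sleep 0.001 until done       # sleeping releases the GVL so the writer can run
  events << :reader_resumed
end

[writer, reader].each(&:join)
order = Array.new(events.size) { events.pop }
p order
```

The writer's step is always logged between the reader's two steps, which is
the interleaving the reproducer tries to force around the #eql? call.
=end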

$in_eql = false

class SlowKey
  attr_reader :v

  def initialize(v, flag = nil)
    @v = v
    @flag = flag
  end

  # Constant hash: every key collides, so a lookup has to call #eql?
  # against the stored entries.
  def hash
    0
  end

  def eql?(other)
    if @flag
      $in_eql = true
      # Sleep releases the GVL, giving the writer thread time to mutate.
      # We sleep long enough for the writer to finish all mutations.
      sleep 0.01
    end
    other.is_a?(SlowKey) && @v == other.v
  end

  def inspect = "K(#{@v})"
end
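
=begin
Why the constant hash matters can be checked separately. The sketch below is
an illustration only: CountingKey is a hypothetical class, not part of the
reproducer. It gives every key the same hash and counts how often #eql? is
invoked on the key used for the lookup.

```ruby
# Hypothetical key class: constant hash plus a per-instance eql? counter.
class CountingKey
  attr_reader :v, :eql_calls

  def initialize(v)
    @v = v
    @eql_calls = 0
  end

  def hash
    0 # identical hash for every key, so all keys collide
  end

  def eql?(other)
    @eql_calls += 1
    other.is_a?(CountingKey) && @v == other.v
  end
end

table = {}
7.times { |i| table[CountingKey.new(i)] = i }

probe = CountingKey.new(-1)   # equal to no stored key
result = table[probe]

p result            # nil, because nothing matched
p probe.eql_calls   # how many times eql? ran on the probe during the lookup
```

Because the hashes all collide, the lookup cannot rule out any entry cheaply
and has to run Ruby-level #eql? code, which is where the reproducer parks the
reader thread.
=end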

puts "Racing ar_find_entry_hint() — expect a crash (segfault/bus error)."
puts "pid=#{$$}"
puts

Thread.abort_on_exception = true
n = 0

loop do
  n += 1
  p n

  # Reset the handshake flag; otherwise every iteration after the first
  # would let the writer run without waiting for the reader.
  $in_eql = false

  lookup_key = SlowKey.new(-1, true)

  h = {}
  7.times { |i| h[SlowKey.new(i)] = i }

  # Writer: spins until reader is inside eql?, then mutates.
  writer = Thread.new do
    # Spin-wait for reader to enter eql? and release the GVL via sleep
    Thread.pass until $in_eql

    # Now the reader is sleeping inside eql?, holding a stale bound
    # and hints pointer from ar_find_entry_hint.

    # Remove all SlowKey entries
    7.times { h.shift }

    # Add enough integer entries to exceed AR capacity (8).
    # This triggers ar_force_convert_table which overwrites
    # the ar_table memory with an st_table struct.
    12.times { |i| h[i] = i }
  end

  reader = Thread.new do
    h[lookup_key]
  rescue => e
    puts "Reader exception: #{e.class}: #{e}"
  end

  writer.join
  reader.join

  if n % 1000 == 0
    $stderr.print "\r#{n} iterations..."
  end
end