Bug #19917
Closed
Segmentation fault or lost objects when using Ractor.select with moved exceptions
Description
I stumbled upon lost messages, and even a segmentation fault, when transferring an exception out of a Ractor by moving it. In versions 3.0 and 3.1 I saw only lost messages; in 3.2 I also saw a segmentation fault. Messages are lost only when using Ractor.select (not Ractor#take), but the segmentation fault also happens with Ractor#take.
100.times do
  ractor_count = 10
  ractors = Array.new(ractor_count) do |i|
    Ractor.new do
      begin
        raise 'foo'
      rescue => e
        # The exception cannot be moved without duplicating it first; without
        # the duplication, the backtrace shows errors in 3.0 and is empty in 3.1+
        e = Marshal.load(Marshal.dump(e))
        Ractor.yield e, move: true
      end
      'message got lost'
    end
  end
  ractor_count.times do
    ractor, result = Ractor.select(*ractors)
    p result
    ractors.delete(ractor)
  end
end
In 3.0 and 3.1 the output contains a mix of #<RuntimeError: foo> and "message got lost"; in 3.2 it may contain both and may also end in a segmentation fault. Excerpt of the output (full output attached) from a container running ruby:3.2 (ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]):
…
#<RuntimeError: RuntimeError>
"message got lost"
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
#<RuntimeError: RuntimeError>
<internal:marshal>:34: [BUG] Segmentation fault at 0x0000000000000000
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
-- Control frame information -----------------------------------------------
c:0004 p:0006 s:0019 e:000018 METHOD <internal:marshal>:34
c:0003 p:0022 s:0011 e:000010 RESCUE (irb):11
c:0002 p:0007 s:0007 e:000006 BLOCK (irb):6 [FINISH]
c:0001 p:---- s:0003 e:000002 DUMMY [FINISH]
-- Ruby level backtrace information ----------------------------------------
(irb):6:in `block (3 levels) in <top (required)>'
(irb):11:in `rescue in block (3 levels) in <top (required)>'
<internal:marshal>:34:in `load'
-- Machine register context ------------------------------------------------
RIP: 0x00007fabc1d64490 RBP: 0x0000000000000000 RSP: 0x00007fabbc71b0c0
RAX: 0x0000000000000003 RBX: 0x00007fabbcebee88 RCX: 0x000000000000015f
RDX: 0x0000556127db90f0 RDI: 0x000000000000015f RSI: 0x00007fabc2068628
R8: 0x0000000000000001 R9: 0x0000000000000020 R10: 0x00005561285633d0
R11: 0x0000000000000001 R12: 0x0000556127dbb710 R13: 0x00007fabc0601be0
R14: 0x0000000000000018 R15: 0x0000000000000000 EFL: 0x0000000000010206
…
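To quantify how often a message is lost, the values returned by Ractor.select in the script above could be collected into an array and tallied instead of printed. The following is only a sketch: tally_results and LOST_MARKER are hypothetical names that do not appear in the original report, and the sample array stands in for real results.

```ruby
# Hypothetical helper (not part of the original report): classify each
# value that the reproduction script would otherwise print with `p`.
LOST_MARKER = 'message got lost'

def tally_results(results)
  counts = { exceptions: 0, lost: 0, other: 0 }
  results.each do |result|
    case result
    when Exception   then counts[:exceptions] += 1 # the yielded exception arrived
    when LOST_MARKER then counts[:lost] += 1       # only the block's return value arrived
    else                  counts[:other] += 1
    end
  end
  counts
end

# Stand-in data; in the real script these would come from Ractor.select.
sample = [RuntimeError.new('foo'), 'message got lost', RuntimeError.new('foo')]
p tally_results(sample)
```

Any non-zero :lost count would mean Ractor.select returned the ractor's final value without first delivering the yielded exception.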
Updated by luke-gru (Luke Gruber) about 1 year ago
I'm not in front of a machine that runs Ruby right now, but I don't think this has to do with ractors. I was trying things out in the Ruby playground and managed to crash Ruby with this as well:
100.times do |j|
  ractor_count = 10
  ractors = Array.new(ractor_count) do |i|
    Proc.new do
      begin
        raise 'foo'
      rescue => e
        err = Marshal.load(Marshal.dump(e))
      end
      #e
    end
  end
  ractors.each(&:call)
end
It looks like it's something related to Marshal.
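One way to examine the Marshal step on its own is to round-trip a raised exception outside of any Ractor and compare the copy with the original. This is a minimal sketch, not a reproduction of the crash; marshal_copy and raised_error are hypothetical helper names introduced here for illustration.

```ruby
# Hypothetical helper: deep-copy an exception the same way the report does.
def marshal_copy(error)
  Marshal.load(Marshal.dump(error))
end

# Hypothetical helper: raise and rescue so the exception carries a backtrace.
def raised_error(message)
  raise message
rescue => e
  e
end

original = raised_error('foo')
copy = marshal_copy(original)

p copy.class                            # class of the copy
p copy.message                          # message of the copy
p copy.equal?(original)                 # whether it is the very same object
p copy.backtrace == original.backtrace  # whether the backtrace survived
```

Running this in a loop, without ractors, would help separate a problem in Marshal itself from one in how ractors move objects.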
Updated by toy (Ivan Kuchin) about 1 year ago
I wasn't able to reproduce the segmentation fault, but even if it is caused by Marshal, that doesn't explain the lost messages (when Ractor.select returns a ractor and "message got lost").
Updated by ko1 (Koichi Sasada) about 1 year ago
- Status changed from Open to Closed
Applied in changeset git|054f56fd3e5bf84e5443896fd1f4e439c2773c60.
moved object should not have a shape ID
fix [Bug #19917]