Bug #14939: [PATCH] cont.c (ec_switch): prevent delayed/missed trap interrupt race - Ruby - Ruby Issue Tracking System

Actions

Copy link

Bug #14939

closed

[PATCH] cont.c (ec_switch): prevent delayed/missed trap interrupt race

Added by normalperson (Eric Wong) about 7 years ago. Updated almost 7 years ago.

Status:

Closed

Assignee:

ko1 (Koichi Sasada)

Target version:

ruby -v:

Backport:

2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: DONE

[ruby-core:88110]

Description

ko1: I noticed this while working on timer-thread elimination/lazy-spawning.
However, it looks like a bug we introduced in 2.5 with `ec'.
Can you check this patch? Thanks.

cont.c (ec_switch): prevent delayed/missed trap interrupt race

timer-thread may set trap interrupt with rb_threadptr_check_signal
at any time independent of GVL. This means timer-thread may set
the trap interrupt flag on the previous execution context; causing
the flag to be unnoticed until a future ec switch (or lost
completely if the ec is done).

Note: I avoid relying on th->interrupt_lock here and use
atomics because we won't be able to rely on it for proposed lazy
timer-thread [Misc #14937].

This regression affects Ruby 2.5 as it was introduced by moving
interrupt_flag to `ec' which is an unstable pointer. Ruby <= 2.4
was unaffected because vm->main_thread->interrupt_flag never
changed.

Files

0001-cont.c-ec_switch-prevent-delayed-missed-trap-interru.patch (1.54 KB) 0001-cont.c-ec_switch-prevent-delayed-missed-trap-interru.patch

normalperson (Eric Wong), 07/25/2018 09:37 PM

Actions

Copy link

#1 [ruby-core:88119]

Updated by ko1 (Koichi Sasada) about 7 years ago

Good point. Seems good. Thank you!

BTW, I forget that why the trap handler is limited to main thread.

trap(:INT) do
  p Thread.current
end

Thread.new{
  sleep 1
  Process.kill :INT, $$
  p [:child, Thread.current]
}.join
sleep 10
p [:main, Thread.current]

__END__

[:child, #<Thread:0x00000125801a3028@t.rb:5 run>]
#<Thread:0x00000125801e9c80 run>
# after 10 seconds
[:main, #<Thread:0x00000125801e9c80 run>]

Can we relax this limitation?

Actions

Copy link

#2 [ruby-core:88120]

Updated by ko1 (Koichi Sasada) about 7 years ago

Oops, sorry forget about sleep lines.

Actions

Copy link

Updated by normalperson (Eric Wong) about 7 years ago

Status changed from Open to Closed

Applied in changeset trunk|r64062.

cont.c (ec_switch): prevent delayed/missed trap interrupt race

Note: I avoid relying on th->interrupt_lock here and use
atomics because we won't be able to rely on it for proposed lazy
timer-thread [Misc #14937].

This regression affects Ruby 2.5 as it was introduced by moving
interrupt_flag to `ec' which is an unstable pointer. Ruby <= 2.4
was unaffected because vm->main_thread->interrupt_flag never
changed.

[ruby-core:88119] [Bug #14939]

Actions

Copy link

#4 [ruby-core:88121]

Updated by normalperson (Eric Wong) about 7 years ago

ko1@atdot.net wrote:

Good point. Seems good. Thank you!

OK, commited as r64062

BTW, I forget that why the trap handler is limited to main thread.

I can think of at least three reasons:

Similar to this bug: other threads may exit soon,
main thread is long-lived so interrupt won't get lost
Typical trap handlers are established at program startup,
before other threads exist, so maybe some are reliant on
Thread.current[] values inside trap (perhaps unknowingly
through libraries).
Concurrency problem. Mutex isn't usable inside trap,
but there may be instances of reentrancy-safe code which
can't be made thread-safe.

Can we relax this limitation?

I don't think it's necessary; anybody who relies on trap will
likely structure main thread to deal with it (e.g. not block on
slow NFS IO#read :P). We can workaround 1), but I expect
compatibility and safety problems from 2) and 3).

Actions

Copy link

#5 [ruby-core:89374]

Updated by nagachika (Tomoyuki Chikanaga) almost 7 years ago

Backport changed from 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: REQUIRED to 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: DONE

ruby_2_5 r64999 merged revision(s) 64062.

Actions

Copy link

Also available in: Atom PDF

Like0

Like0Like0Like0Like0Like0

Project

General

Profile

Ruby

Tags

Custom queries

Bug #14939

[PATCH] cont.c (ec_switch): prevent delayed/missed trap interrupt race

Updated by ko1 (Koichi Sasada) about 7 years ago

Updated by ko1 (Koichi Sasada) about 7 years ago

Updated by normalperson (Eric Wong) about 7 years ago

Updated by normalperson (Eric Wong) about 7 years ago

Updated by nagachika (Tomoyuki Chikanaga) almost 7 years ago