Bug #19837
closedConcurrent calls to Process.waitpid2 misbehave on Ruby 3.1 & 3.2
Description
On Ruby 3.1 & 3.2, if you have one thread blocked into a directed call to Process.waitpid2 with a pid specified, a concurrent call to Process.waitpid2 -1 will not be able to find & reap any other terminated child process, even one with a different pid that is not individually being waited on.
I've attached a Ruby program which should terminate but doesn't as a result of this bug, as well as a C program which demonstrates that the underlying syscalls (at least on Linux) do behave how you would expect. My reproduction creates two processes; a long-running process that does not exit, and a short one which does. There is a background thread calling Process.waitpid2 on the long process. Then, a concurrent call to Process.waitpid2 -1 does not notice that the short-running process has exited.
My wait_bug.rb program does work properly and terminate on the current master branch of Ruby; I assume this is because all of the MJIT-related process management stuff with the waiting_pids & stuff has been cleaned up as part of the MJIT -> RJIT refactoring. Because of this, I'm not sure exactly how to make a patch; should I open a pair of PRs targeting the ruby_3_2 and ruby_3_1 branch?
Thanks!
Files
        
           Updated by jeremyevans0 (Jeremy Evans) about 2 years ago
          Updated by jeremyevans0 (Jeremy Evans) about 2 years ago
          
          
        
        
      
      @kjtsanaktsidis (KJ Tsanaktsidis) Yes, please prepare backport pull requests for ruby_3_2 and ruby_3_1, then we can update the backport flags in this issue to let the branch maintainer know it is ready for backporting.
        
           Updated by kjtsanaktsidis (KJ Tsanaktsidis) about 2 years ago
          Updated by kjtsanaktsidis (KJ Tsanaktsidis) about 2 years ago
          
          
        
        
      
      OK - I hope I've done this right, let me know otherwise :)
- Ruby 3.0: https://github.com/ruby/ruby/pull/8248
- Ruby 3.1: https://github.com/ruby/ruby/pull/8246
- Ruby 3.2: https://github.com/ruby/ruby/pull/8247
Also, I opened a PR on the main branch which adds just the test I wrote for this issue - the existing implementation passes it though so there are no other changes needed: https://github.com/ruby/ruby/pull/8245
        
           Updated by jeremyevans0 (Jeremy Evans) about 2 years ago
          Updated by jeremyevans0 (Jeremy Evans) about 2 years ago
          
          
        
        
      
      - Status changed from Open to Closed
- Backport changed from 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN to 3.0: UNKNOWN, 3.1: REQUIRED, 3.2: REQUIRED
This updates the backport flags. Note that Ruby 3.0 is in security maintenance, and this doesn't appear to be security related, so it is unlikely to be backported to Ruby 3.0.
        
           Updated by kjtsanaktsidis (KJ Tsanaktsidis) about 2 years ago
          Updated by kjtsanaktsidis (KJ Tsanaktsidis) about 2 years ago
          
          
        
        
      
      Thanks - yeah it’s no problem if it doesn’t get back ported to 3.0.
        
           Updated by usa (Usaku NAKAMURA) about 2 years ago
          Updated by usa (Usaku NAKAMURA) about 2 years ago
          
          
        
        
      
      - Backport changed from 3.0: UNKNOWN, 3.1: REQUIRED, 3.2: REQUIRED to 3.0: UNKNOWN, 3.1: DONE, 3.2: REQUIRED
merged PR#8246 into ruby_3_1
thx!
        
           Updated by nagachika (Tomoyuki Chikanaga) about 2 years ago
          Updated by nagachika (Tomoyuki Chikanaga) about 2 years ago
          
          
        
        
      
      - Backport changed from 3.0: UNKNOWN, 3.1: DONE, 3.2: REQUIRED to 3.0: UNKNOWN, 3.1: DONE, 3.2: DONE
Merged into ruby_3_2 branch at 0b7a4fbaa9c56d2c67d00d86c69f9e5c71803267.
Thank you!
        
           Updated by byroot (Jean Boussier) almost 2 years ago
          Updated by byroot (Jean Boussier) almost 2 years ago
          
          
        
        
      
      - Related to Bug #20181: Process.wait(-1) doesn't report exited child processes if WAITPID_USE_SIGCHLD is enabled added