Bug #5463

PTY or IO.select timing issue results in no EOF

Added by thinkerbot (Simon Chiang) about 9 years ago. Updated over 6 years ago.

Status: Rejected
Priority: Normal
Target version: -
ruby -v: 1.9.2p290
Backport:
[ruby-core:40226]

Description

I have observed that when running a shell through PTY, the slave will sometimes fail to produce an EOF after an exit command. As a result, polling via IO.select can time out. A full example is attached. This is a simplified example illustrating the problematic loop:

# PTY.spawn ...
master.write "exit 8\n"

str = ''
while true
  unless IO.select([slave], nil, nil, 3)
    raise "timeout waiting for slave EOF"
  end

  if slave.eof?
    break
  end

  str << slave.read(1)
end

After 'exit' is written to master, the loop normally reads all of slave into str. The select ensures the loop can time out, but under normal circumstances it will not (3 seconds is plenty of time to exit a shell). The bug is that it occasionally does time out having never seen an EOF, meaning either the select is not detecting EOF on the slave or an EOF is not being written to the slave.
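For anyone who wants to try the loop without the attachment, here is a minimal self-contained sketch of the same idea. It is not the attached script. The helper name drain_pty_after_exit, the default shell path, and the 1024-byte read size are all assumptions made for illustration. It also differs from the loop above in two ways: the timeout is an overall deadline rather than 3 seconds per select call, and Errno::EIO is treated as end-of-stream, since on Linux a read from the PTY after the child has exited typically raises EIO instead of returning EOF.

```ruby
require 'pty'

# Hypothetical helper (not from the attached script): send `exit <code>` to a
# shell running under a PTY, then collect output until EOF/EIO or a deadline.
# Returns [saw_eof, exit_status, output].
def drain_pty_after_exit(shell = '/bin/sh', code = 8, timeout = 3)
  reader, writer, pid = PTY.spawn(shell)
  output = String.new
  saw_eof = false

  begin
    writer.write "exit #{code}\n"
    deadline = Time.now + timeout

    loop do
      remaining = deadline - Time.now
      break if remaining <= 0                 # overall timeout, no EOF seen
      next unless IO.select([reader], nil, nil, remaining)

      begin
        output << reader.read_nonblock(1024)
      rescue IO::WaitReadable
        next                                  # spurious wakeup, poll again
      rescue EOFError, Errno::EIO
        saw_eof = true                        # EOF (BSD/OS X) or EIO (Linux)
        break
      end
    end
  ensure
    # Closing both ends hangs up the PTY, so a shell that is somehow still
    # running is sent SIGHUP instead of making the wait below block forever.
    reader.close
    writer.close
  end

  _, status = Process.wait2(pid)
  [saw_eof, status.exitstatus, output]
end

if __FILE__ == $0
  saw_eof, exit_status, = drain_pty_after_exit('/bin/sh', 8, 3)
  puts "saw_eof=#{saw_eof} exit_status=#{exit_status.inspect}"
end
```

Running this in a loop and counting the iterations where saw_eof is false but exit_status is still 8 would correspond to the failure described here. Raising the timeout argument is a quick way to check whether the missing EOF is only a matter of waiting longer.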

The bizarre thing is that I can confirm after the timeout that the pty process does exit with the correct status (8) regardless of whether the loop exits normally with an EOF or by timeout.

I'm not sure if this is an issue with the PTY implementation, an issue with the shell, or with the OS. I have observed the bug repeatedly using 1.9.2 on OS X 10.6.8, Ubuntu 11.04, and SLES 10, and with the shells bash, ksh, csh, and zsh (although mostly with bash). I suspect the PTY implementation plays some role because the bug does not appear to occur on 1.8.7 and 1.8.6. However, the frequency of the bug varies so much across OS and shell that it could very well be an issue outside of Ruby.

To reproduce, run the pty_fail.rb script for 10k (or more) iterations. On OS X it usually crops up within 10k. On Ubuntu 11.04 it is very, very rare; ~100k may be needed. Example:

ruby pty_no_eof_example.rb 10000 /bin/bash


Files

pty_fail.rb (2.97 KB) thinkerbot (Simon Chiang), 10/19/2011 12:02 PM
pty_fail.rb (2.29 KB) thinkerbot (Simon Chiang), 10/19/2011 12:04 PM

Updated by thinkerbot (Simon Chiang) about 9 years ago

Oops! I uploaded the wrong file. Apologies, here is the correct file (the 2.3 kB one).

Updated by ko1 (Koichi Sasada) over 8 years ago

  • Category set to ext
  • Assignee set to akr (Akira Tanaka)

Updated by shyouhei (Shyouhei Urabe) over 8 years ago

  • Status changed from Open to Assigned

Updated by akr (Akira Tanaka) over 7 years ago

  • Status changed from Assigned to Feedback

I tried to reproduce the problem on Debian GNU/Linux and FreeBSD. (I don't have Mac OS X.)
It is possible but very rare.

However, the problem occurs more frequently if I run a different heavy task on the host.

So I guess the problem is just that 3 seconds is not enough.
It is possible that the OS runs the child process very slowly if the host is very busy.

Is there any evidence that this is actually a problem in Ruby?

Updated by akr (Akira Tanaka) over 6 years ago

  • Status changed from Feedback to Rejected

I think this is not a problem of Ruby.
