Bug #20672 (closed)
UNIXSocket.pair transmitting data between pids looks flaky
Description
I have code that uses UNIXSocket.pair and fork to send data between parent and child.
It seems to work fine for a small number of messages, but then fails with Errno::EBADF when the child process writes to the socket it received from the parent.
I attached a test driver which includes successful and failed test runs. This is boiled down logic from a larger app, where I first started seeing this issue. I believe I first saw this on Ruby 3.0, but now I am on 3.3.4.
This could be me, as in my code or my workstation env, but I have not been able to prove that from many web searches. If you look at my test driver and see something wrong with it, I will take the shame and learn something. BUT just in case this is an issue in low level Ruby code, I am submitting this here.
Updated by byroot (Jean Boussier) 3 months ago
I'm able to reproduce with a large enough message (e.g. -c 4000 on my machine).
However, if I replace:
received_w = UNIXSocket.for_fd(main_c.recv_io.fileno)
with:
received_w = main_c.recv_io
received_w.sync = true
It fixes the problem. I'm not yet clear on what is causing this, but I suspect some metadata is lost when you use for_fd. But I think it should work, so I'll try to figure out what exactly.
Updated by byroot (Jean Boussier) 3 months ago
- Status changed from Open to Rejected
Alright I've figured it out.
recv_io creates an IO instance, which you then discard. But once that IO instance is garbage collected, it automatically closes its file descriptor.
So this issue can be reproduced much earlier by calling GC.start just after UNIXSocket.for_fd(main_c.recv_io.fileno).
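A minimal sketch of the mechanism described above, run in a single process rather than across fork. The names main_p, data_r, data_w, and tmp are illustrative and do not come from the test driver. Because GC timing is not deterministic, an explicit close stands in for the finalizer that would run when the discarded IO is collected.

```ruby
require "socket"

main_p, main_c = UNIXSocket.pair   # channel that carries the descriptor
data_r, data_w = UNIXSocket.pair   # data_w is the socket being handed over

main_p.send_io(data_w)

tmp = main_c.recv_io                        # first IO object owning the received fd
received_w = UNIXSocket.for_fd(tmp.fileno)  # second IO object wrapping the same fd

# Assumption: this explicit close stands in for the finalizer that runs
# when the discarded IO returned by recv_io is garbage collected.
tmp.close

error = begin
  received_w.syswrite("hello")
  nil
rescue Errno::EBADF => e
  e
end

# The fd is already gone, so stop received_w from trying to close it again.
received_w.autoclose = false
```

Here error should hold the Errno::EBADF raised by the write, since the second wrapper still refers to a descriptor number that the first wrapper has already closed.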
My suggestion fixes it because it avoids creating two IOs for the same FD, so the GC doesn't end up closing the file descriptor.
You can also fix this by setting autoclose = false on the IO instance returned by recv_io.
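A rough sketch of the autoclose variant, again in a single process and with illustrative names (main_p, data_r, data_w, tmp) that are not from the test driver. It assumes the intermediate IO is kept in a variable long enough to set the flag before for_fd wraps the same descriptor.

```ruby
require "socket"

main_p, main_c = UNIXSocket.pair   # channel that carries the descriptor
data_r, data_w = UNIXSocket.pair   # data_w is the socket being handed over
main_p.send_io(data_w)

tmp = main_c.recv_io
tmp.autoclose = false              # tmp will not close the fd when it is finalized
received_w = UNIXSocket.for_fd(tmp.fileno)
received_w.sync = true

tmp = nil
GC.start                           # collecting the first wrapper is now harmless

received_w.write("hello")
message = data_r.recv(5)
```

With autoclose disabled on the first wrapper, only received_w is responsible for closing the descriptor.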
But the cleaner way to do this is:
received_w = main_c.recv_io(UNIXSocket)
So yeah, not a bug in ruby, but in your code.
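For illustration, a small single-process sketch of the recv_io(UNIXSocket) form. The names main_p, data_r, and data_w are assumptions made for the example, and the original driver passes the socket across fork rather than within one process.

```ruby
require "socket"

main_p, main_c = UNIXSocket.pair   # channel that carries the descriptor
data_r, data_w = UNIXSocket.pair   # data_w is the socket being handed over
main_p.send_io(data_w)

# Passing the class makes recv_io build the UNIXSocket itself, so the
# received descriptor is only ever owned by one Ruby object.
received_w = main_c.recv_io(UNIXSocket)
received_w.sync = true

GC.start                           # no discarded wrapper exists to close the fd

received_w.write("hello")
message = data_r.recv(5)
```

Since no second IO object is created for the descriptor, there is nothing for the GC to finalize behind received_w's back.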
Updated by danh337 (Dan H) 3 months ago
byroot (Jean Boussier) wrote in #note-2:
Alright I've figured it out.
[...]
received_w = main_c.recv_io(UNIXSocket)
So yeah, not a bug in ruby, but in your code.
WOW. @byroot (Jean Boussier) you are a champion. I knew I needed to "cast" the received data from a plain IO to a UNIXSocket, but I didn't read the recv_io docs, which clearly state how to do this. You made a great catch. Cheers to you. Sorry for taking your time.