https://bugs.ruby-lang.org/https://bugs.ruby-lang.org/favicon.ico?17113305112018-04-12T03:21:07ZRuby Issue Tracking SystemRuby master - Bug #14681: `syswrite': stream closed in another thread (IOError)https://bugs.ruby-lang.org/issues/14681?journal_id=714632018-04-12T03:21:07Zioquatix (Samuel Williams)samuel@oriontransfer.net
<ul></ul><p>I confirmed the issue also applies to 2.5.1</p>
<p>I added a mutex to the write and close operations (which should not be necessary) and it "fixes" the issue.</p>
<pre><code>100.times.collect do
  Thread.new do
    mutex = Mutex.new
    input, output = IO.pipe
    worker = Thread.new do
      sleep(0.1)
      mutex.synchronize do
        output.syswrite('.')
      end
    end
    input.read(1)
    mutex.synchronize do
      input.close
      output.close
    end
    worker.join
  end
end.each(&:join)
</code></pre> Ruby master - Bug #14681: `syswrite': stream closed in another thread (IOError)https://bugs.ruby-lang.org/issues/14681?journal_id=715912018-04-21T00:33:17Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p>Thanks, fixing now.</p> Ruby master - Bug #14681: `syswrite': stream closed in another thread (IOError)https://bugs.ruby-lang.org/issues/14681?journal_id=715922018-04-21T03:32:51Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p><a href="mailto:samuel@oriontransfer.org" class="email">samuel@oriontransfer.org</a> wrote:</p>
<blockquote>
<p>Bug <a class="issue tracker-1 status-1 priority-4 priority-default" title="Bug: `syswrite': stream closed in another thread (IOError) (Open)" href="https://bugs.ruby-lang.org/issues/14681">#14681</a>: `syswrite': stream closed in another thread (IOError)<br>
<a href="https://bugs.ruby-lang.org/issues/14681" class="external">https://bugs.ruby-lang.org/issues/14681</a></p>
</blockquote>
<p>There are two bugs here, I think. I made r63216 because it<br>
became obvious to me with the timeline shown in the commit message:</p>
<p><a href="https://80x24.org/spew/20180421024614.7362-1-e@80x24.org/raw" class="external">https://80x24.org/spew/20180421024614.7362-1-e@80x24.org/raw</a></p>
<p>However, I have a work-in-progress fix which probably requires<br>
API rework for rb_thread_io_blocking_region:</p>
<p><a href="https://80x24.org/spew/20180421021502.31552-1-e@80x24.org/" class="external">https://80x24.org/spew/20180421021502.31552-1-e@80x24.org/</a><br>
Note: the /* TODO: check func() */ in rb_thread_io_blocking_region<br>
But that WIP patch is broken...</p>
<p>I think we need to replace rb_thread_io_blocking_region to<br>
permanently fix your problem. My change to check "val != Qundef"<br>
is insufficient and wrong. But I don't know how reasonable close<br>
notifications can be with APIs like IO.copy_stream and<br>
IO.select which work on multiple IOs, even...</p>
<p>However, you can work around the problem simply:</p>
<blockquote>
<p>100.times.collect do<br>
Thread.new do<br>
input, output = IO.pipe</p>
<pre><code> worker = Thread.new do
sleep(0.1)
output.syswrite('.')
end
input.read(1)
input.close
output.close
worker.join
</code></pre>
</blockquote>
<p>You should be able to rearrange the order of the last two calls<br>
so worker.join happens before output.close:</p>
<pre><code> worker.join
output.close
</code></pre>
<p>That should avoid the problem described in<br>
<a href="https://80x24.org/spew/20180421021502.31552-1-e@80x24.org/" class="external">https://80x24.org/spew/20180421021502.31552-1-e@80x24.org/</a></p>
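<p>For reference, the reordering can be written out as a complete script. This is only a sketch of the workaround described above, not a fix for the underlying bug; the iteration count and sleep are shortened and the <code>results</code> variable is added purely for illustration.</p>

```ruby
# Sketch of the suggested reordering: join the writer thread before
# closing the write end, so output.close can never overlap with syswrite.
results = 10.times.collect do
  Thread.new do
    input, output = IO.pipe

    worker = Thread.new do
      sleep(0.01)
      output.syswrite('.')
    end

    byte = input.read(1)   # blocks until the worker has written
    input.close
    worker.join            # wait for the worker to leave syswrite
    output.close           # now safe: nobody is using the write end
    byte                   # thread value, collected below (illustrative)
  end
end.map(&:value)
```

<p>The only change from the original script is the order of the last two calls; everything else is unchanged.</p>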
<blockquote>
<p>end<br>
end.each(&:join)</p>
<pre><code></code></pre>
</blockquote> Ruby master - Bug #14681: `syswrite': stream closed in another thread (IOError)https://bugs.ruby-lang.org/issues/14681?journal_id=715942018-04-21T09:04:12Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p>Eric Wong <a href="mailto:normalperson@yhbt.net" class="email">normalperson@yhbt.net</a> wrote:</p>
<blockquote>
<p><a href="https://80x24.org/spew/20180421021502.31552-1-e@80x24.org/" class="external">https://80x24.org/spew/20180421021502.31552-1-e@80x24.org/</a><br>
Note: the /* TODO: check func() */ in rb_thread_io_blocking_region<br>
But that WIP patch is broken...</p>
<p>I think we need to replace rb_thread_io_blocking_region to<br>
permanently fix your problem. My change to check "val != Qundef"<br>
is insufficient and wrong. But I don't know how reasonable close<br>
notifications can be with APIs like IO.copy_stream and<br>
IO.select which work on multiple IOs, even...</p>
</blockquote>
<p>Btw, this problem has been present since 1.9.3 at least, probably<br>
earlier; and your original script fails on older versions as<br>
well. (but they seem OK with my proposed reordering to<br>
worker.join before output.close)</p>
<p>So maybe it's not so urgent and we can take our time with API<br>
design.</p>
<p>I am thinking of replacing all GVL release functions with<br>
something which takes an opaque attr like pthread_create uses.<br>
It's a bit verbose, but should be extensible:</p>
<pre><code>--- a/io.c
+++ b/io.c
@@ -983,11 +983,19 @@ static ssize_t
rb_write_internal(int fd, const void *buf, size_t count)
{
struct io_internal_write_struct iis;
+ ssize_t w;
+ rb_thread_run_t tr;
+ rb_thread_attr_t attr;
iis.fd = fd;
iis.buf = buf;
iis.capa = count;
- return (ssize_t)rb_thread_io_blocking_region(internal_write_func, &iis, fd);
+ rb_thread_attr_init(&attr, 1);
+ rb_thread_attr_set_fds(&attr, 1, &fd);
+ rb_thread_attr_set_ubf(&attr, RUBY_UBF_IO, NULL);
+ rb_thread_run_do(&tr, &attr, internal_write_func, &iis);
+ rb_thread_run_join(tr, &w);
+ return w;
}
static ssize_t
</code></pre>
<p>rb_thread_attr_set_fds would allow setting multiple FDs<br>
for functions like sendfile/splice/copy_file_range/tee,<br>
something rb_thread_io_blocking_region can't do.<br>
(but probably not used for poll/select/ppoll)</p>
<p>rb_thread_attr_set_ubf - to set unblocking function + arg<br>
as with current rb_thread_call_without_gvl2</p>
<p>rb_thread_attr_set_intrfail - set fail-if-interrupted flag<br>
This will allow removing the confusing difference between<br>
rb_thread_call_without_gvl and rb_thread_call_without_gvl2<br>
calls (which does which again?)</p>
<p>Future expansion:</p>
<p>rb_thread_attr_set_bind - allow migrating to different native thread</p> Ruby master - Bug #14681: `syswrite': stream closed in another thread (IOError)https://bugs.ruby-lang.org/issues/14681?journal_id=715982018-04-21T13:14:26Zioquatix (Samuel Williams)samuel@oriontransfer.net
<ul></ul><p>Eric, thanks so much for your commitment to fixing this issue and for taking a look at my specific use case.</p>
<p>I will try out your suggestions. If it works, it's good enough.</p>
<p>I look forward to your future bug fix.</p> Ruby master - Bug #14681: `syswrite': stream closed in another thread (IOError)https://bugs.ruby-lang.org/issues/14681?journal_id=715992018-04-21T13:18:59Zioquatix (Samuel Williams)samuel@oriontransfer.net
<ul></ul><p>I reviewed your suggestion, and while it (in theory) works with the original example, it won't work with my actual use case. I have a fixed number of worker threads reading from a shared job queue and executing jobs - when the job is completed, the IO is triggered and on the waiting end the IO is closed once the input is read - there is no join except when the thread pool is stopped.</p> Ruby master - Bug #14681: `syswrite': stream closed in another thread (IOError)https://bugs.ruby-lang.org/issues/14681?journal_id=716002018-04-21T13:26:11Zioquatix (Samuel Williams)samuel@oriontransfer.net
<ul></ul><p>Excuse my ignorance, but if you call write, why can't you just directly invoke <code>::write</code>? Why do you need to do <code>rb_thread_io_blocking_region</code>?</p> Ruby master - Bug #14681: `syswrite': stream closed in another thread (IOError)https://bugs.ruby-lang.org/issues/14681?journal_id=716012018-04-21T18:03:54Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p><a href="mailto:samuel@oriontransfer.org" class="email">samuel@oriontransfer.org</a> wrote:</p>
<blockquote>
<p>Excuse my ignorance, but if you call write, why can't you just<br>
directly invoke <code>::write</code>? Why do you need to do<br>
<code>rb_thread_io_blocking_region</code>?</p>
</blockquote>
<p>rb_thread_io_blocking_region releases the GVL because write(2) may<br>
block on slow filesystem, full pipe/sockets, etc.</p>
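<p>A small sketch of why the GVL release matters, assuming a pipe buffer smaller than the payload (1 MB is chosen here only because typical pipe buffers are far smaller): the writer thread blocks in write(2) on a full pipe, yet the main thread keeps running Ruby code. The variable names are illustrative.</p>

```ruby
# A writer blocked on a full pipe must not stall other Ruby threads.
input, output = IO.pipe
payload = 'x' * 1_000_000   # assumed to exceed the pipe buffer size

writer = Thread.new do
  output.write(payload)     # blocks once the pipe buffer is full
  output.close              # lets the reader see EOF
end

sleep 0.2                   # give the writer time to fill the pipe and block

# The main thread still makes progress while the writer is stuck in write(2).
counter = 0
1000.times { counter += 1 }
writer_blocked = writer.alive?

received = input.read.bytesize   # drain the pipe, which unblocks the writer
writer.join
input.close
```

<p>If write(2) were called with the GVL held, the loop in the main thread could not run until the pipe was drained, and nothing would ever drain it.</p>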
<p>We no longer set O_NONBLOCK on sockets/pipes by default since<br>
1.9+; but that didn't help with slow filesystems whose<br>
buffers are full (or using weird stuff like O_SYNC/O_DIRECT).</p>
<p>rb_thread_io_blocking_region is slightly different than<br>
rb_thread_call_without_gvl because Ruby has traditionally<br>
signaled cross-thread IO#close with IOError instead of undefined<br>
platform-specific behavior (sometimes EBADF, sometimes blocks<br>
for a long time and succeeds, ...). I'm not sure I actually<br>
agree with this behavior, just stating what things are.</p> Ruby master - Bug #14681: `syswrite': stream closed in another thread (IOError)https://bugs.ruby-lang.org/issues/14681?journal_id=716022018-04-21T18:12:42Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p><a href="mailto:samuel@oriontransfer.org" class="email">samuel@oriontransfer.org</a> wrote:</p>
<blockquote>
<p>I reviewed your suggestion, and while it (in theory) works<br>
with the original example, it won't work with my actual use<br>
case which uses a set of shared threads to implement<br>
background workers - there is no join except when the thread<br>
pool is stopped.</p>
</blockquote>
<p>If its threads are background workers, perhaps SizedQueue is<br>
better (and probably faster since 2.5).</p>
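<p>A rough sketch of what that could look like, assuming jobs are plain callables and each job carries its own reply Queue in place of a pipe. The names <code>jobs</code>, <code>done</code> and the nil stop marker are made up for this example and are not taken from the reporter's code.</p>

```ruby
# Fixed pool of workers fed by a bounded queue; completion is signaled
# through a per-job Queue, so no IO object is shared or closed across threads.
jobs = SizedQueue.new(4)

workers = 2.times.map do
  Thread.new do
    # A nil job is used here as a stop marker (an assumption of this sketch).
    while (job = jobs.pop)
      work, done = job
      done.push(work.call)   # wake up whoever waits for this job
    end
  end
end

done = Queue.new
jobs.push([-> { 6 * 7 }, done])
result = done.pop            # blocks until a worker has finished the job

workers.size.times { jobs.push(nil) }   # stop the pool
workers.each(&:join)
```

<p>The waiting side blocks in <code>Queue#pop</code> instead of <code>IO#read</code>, so there is nothing to close once the job is done.</p>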
<p>Or, why bother closing the output early?</p> Ruby master - Bug #14681: `syswrite': stream closed in another thread (IOError)https://bugs.ruby-lang.org/issues/14681?journal_id=716042018-04-21T22:12:52Zioquatix (Samuel Williams)samuel@oriontransfer.net
<ul></ul><blockquote>
<p>We no longer set O_NONBLOCK on sockets/pipes by default since<br>
1.9+; but that didn't help with slow filesystems whose<br>
buffers are full (or using weird stuff like O_SYNC/O_DIRECT).</p>
</blockquote>
<p>Can I set this to avoid the breaking code path? I know the write would succeed every time in this case.</p>
<blockquote>
<p>signaled cross-thread IO#close</p>
</blockquote>
<p>Do you mind explaining what this is? Do you mean if one thread is calling <code>read</code> and then another thread calls <code>close</code> while <code>read</code> is still in progress?</p>
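<p>The scenario in that question can be sketched as follows: one thread is blocked in <code>read</code> while another thread closes the same IO. The sleep is only a crude way to let the reader block first, and the variable names are illustrative.</p>

```ruby
# One thread blocks in read; another thread closes the same IO object.
input, output = IO.pipe

reader = Thread.new do
  begin
    input.read(1)            # blocks: nothing is ever written
  rescue IOError => e
    [e.class, e.message]     # report what the blocked reader observed
  end
end

sleep 0.2                    # crude: give the reader time to block in read
input.close                  # close from a different thread
error_class, message = reader.value
output.close
```

<p>Here the blocked reader is woken up with an IOError rather than being left to whatever the platform does when a descriptor is closed under a pending read.</p>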
<blockquote>
<p>If its threads are background workers, perhaps SizedQueue is<br>
better (and probably faster since 2.5).</p>
</blockquote>
<p>I didn't know about this. Thanks for your suggestion, I will review it.</p> Ruby master - Bug #14681: `syswrite': stream closed in another thread (IOError)https://bugs.ruby-lang.org/issues/14681?journal_id=716052018-04-22T00:12:40Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p><a href="mailto:samuel@oriontransfer.org" class="email">samuel@oriontransfer.org</a> wrote:</p>
<blockquote>
<blockquote>
<p>We no longer set O_NONBLOCK on sockets/pipes by default since<br>
1.9+; but that didn't help with slow filesystems whose<br>
buffers are full (or using weird stuff like O_SYNC/O_DIRECT).</p>
</blockquote>
</blockquote>
<blockquote>
<p>Can I set this to avoid the breaking code path? I know the<br>
write would succeed every time in this case.</p>
</blockquote>
<p>In that case, it sounds like you can use write_nonblock instead<br>
of syswrite. *_nonblock functions do not release GVL.</p>
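<p>A minimal sketch of that substitution, assuming a one-byte notification that is expected to fit in the pipe buffer. <code>signal_done</code> is a hypothetical helper written for this example; the retry loop is there only for completeness in case the pipe is full. Note that write_nonblock sets O_NONBLOCK on the underlying descriptor. The sketch joins the worker before closing so that it is deterministic on any Ruby version.</p>

```ruby
# Hypothetical helper: notify the waiting side without a blocking syswrite.
def signal_done(io)
  loop do
    result = io.write_nonblock('.', exception: false)
    return result unless result == :wait_writable
    IO.select(nil, [io])     # pipe full: wait until it is writable again
  end
end

input, output = IO.pipe

worker = Thread.new do
  sleep(0.1)
  signal_done(output)        # returns the number of bytes written
end

byte = input.read(1)
written = worker.value       # joins the worker and fetches its result
input.close
output.close
```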
<blockquote>
<blockquote>
<p>signaled cross-thread IO#close</p>
</blockquote>
<p>Do you mind explaining what this is? Do you mean if one thread<br>
is calling <code>read</code> and then another thread calls <code>close</code> while<br>
<code>read</code> is still in progress?</p>
</blockquote>
<p>Exactly</p> Ruby master - Bug #14681: `syswrite': stream closed in another thread (IOError)https://bugs.ruby-lang.org/issues/14681?journal_id=995232022-10-08T11:48:00ZEregon (Benoit Daloze)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-1 priority-4 priority-default" href="/issues/18455">Bug #18455</a>: `IO#close` has poor performance and difficult to understand semantics.</i> added</li></ul>