Project

General

Profile

Actions

Bug #17527

closed

rb_io_wait_readable/writable with scheduler don't check errno

Added by ysbaddaden (Julien Portalier) about 3 years ago. Updated almost 3 years ago.

Status:
Closed
Target version:
-
ruby -v:
ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-linux]
[ruby-core:102003]

Description

Problem

Playing with the new Fiber Scheduler, I noticed that TCPServer#accept would hung forever after closing the server from another Fiber. I expected it to be resumed and fail with IOError, as it happens with threads.

Analysis

What happens is that the accept4 call in rsock_s_accept fails and sets errno to Errno::EBADF, it then checks a few memory/limit related errnos, then calls rb_io_wait_readable expecting it to handle the current errno for IO errors. But when a scheduler is set, it immediately delegates to Scheduler#io_wait and doesn't check the current errno! In my case (nio4r), the io_wait hook returns a ready state, which causes rsock_s_accept to loop forever.

I tried to manually check in the io_wait hook whether the IO is closed, but the fd is never updated (AFAIK never set to -1) so io.closed? is always false. I'm not sure schedulers should check whether the fd is closed, thought.

Proposed solution

A solution is to follow what happens for threads, and only check the scheduler when errno is EAGAIN or EWOULDBLOCK. I believe it's the only errors where we're expected to wait. This change also means that EINTR will be handled, too, and other errnos to raise an exception.

Instead of raising IOError.new("closed stream") as it happens for threads, it raises Errno::EBADF when a Scheduler is set. I suppose in the thread branches, it updates the IO at some point and calls rb_io_check_closed with the updated fd —maybe with GetOpenFile (RB_IO_POINTER) — and we ought to do the same at some point?

Another solution it to not delegate to the scheduler inside rb_io_wait_readable because it will eventually call rb_wait_for_single_fd that will check for the scheduler, but we can avoid some function calls, as well as thread-related debug information that could be confusing. It also won't raise help to raise IOError.

I'm attaching a patch that implements the first solution. It fixes both rb_io_wait_readable and rb_io_wait_writable since the latter may exhibit the same kind of issue in another scenario. This is speculative, I didn't hit one, yet.


Files

Actions #1

Updated by nobu (Nobuyoshi Nakada) about 3 years ago

  • Backport changed from 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN to 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONTNEED, 3.0: UNKNOWN

Updated by nobu (Nobuyoshi Nakada) about 3 years ago

Yet another solution is to check if in schedulers?

Updated by ioquatix (Samuel Williams) about 3 years ago

I checked the PR, I understand, thanks for the clear bug report.

I will review it in more detail, but it seems like a reasonable approach.

Updated by ioquatix (Samuel Williams) about 3 years ago

  • Assignee set to ioquatix (Samuel Williams)

Updated by ioquatix (Samuel Williams) almost 3 years ago

I think I ran into this bug on the write code path. I'm just going to confirm it.

Updated by ioquatix (Samuel Williams) almost 3 years ago

Okay, I confirmed the fix is required for async-io to use the native IO methods.

samuel@Fukurou ~/D/s/async-io (master) [1]> chruby ruby-3
samuel@Fukurou ~/D/s/async-io (master)> bundle exec rspec ./spec/async/io/stream_spec.rb:81
Run options: include {:locations=>{"./spec/async/io/stream_spec.rb"=>[81]}}

Async::IO::Stream
  socket I/O
    #drain_write_buffer
10.02s    error: Async::Task [oid=0x943e50] [pid=1398] [2021-03-30 20:00:10 +1300]
               |   Async::TimeoutError: run time exceeded duration 10s:
               |   #<Async::Reactor:0x943e28 2 children (running)>
               |   	#<Async::Task:0x943e3c timer task duration=10 (running)>
               |   	→ /Users/samuel/Documents/socketry/async/lib/async/task.rb:91:in `backtrace'
               |   	  /Users/samuel/Documents/socketry/async/lib/async/task.rb:91:in `backtrace'
               |   	  /Users/samuel/Documents/socketry/async/lib/async/node.rb:322:in `print_backtrace'
               |   	  /Users/samuel/Documents/socketry/async/lib/async/node.rb:315:in `block in print_hierarchy'
               |   	  /Users/samuel/Documents/socketry/async/lib/async/node.rb:298:in `traverse'
               |   	  /Users/samuel/Documents/socketry/async/lib/async/node.rb:301:in `block in traverse'
               |   	  /Users/samuel/Documents/socketry/async/lib/async/node.rb:88:in `each'
               |   	  /Users/samuel/Documents/socketry/async/lib/async/node.rb:300:in `traverse'
               |   	  /Users/samuel/Documents/socketry/async/lib/async/node.rb:310:in `print_hierarchy'
               |   	  /Users/samuel/.gem/ruby/3.0.0/gems/async-rspec-1.15.1/lib/async/rspec/reactor.rb:48:in `block in run_in_reactor'
               |   	  /Users/samuel/Documents/socketry/async/lib/async/task.rb:265:in `block in make_fiber'
               |   	#<Async::Task:0x943e50 RSpec::ExampleGroups::AsyncIOStream::SocketIO::DrainWriteBuffer (running)>
               |   	→ /Users/samuel/Documents/socketry/async/lib/async/task.rb:62:in `yield'
               |   	  /Users/samuel/Documents/socketry/async/lib/async/task.rb:62:in `yield'
               |   	  /Users/samuel/Documents/socketry/async/lib/async/condition.rb:40:in `wait'
               |   	  /Users/samuel/Documents/socketry/async/lib/async/task.rb:150:in `wait'
               |   	  /Users/samuel/.gem/ruby/3.0.0/gems/async-rspec-1.15.1/lib/async/rspec/reactor.rb:69:in `block in run_in_reactor'
               |   	  /Users/samuel/Documents/socketry/async/lib/async/task.rb:265:in `block in make_fiber'
               |   		#<Async::Task:0x943e64 running example (running)>
               |   		→ /Users/samuel/Documents/socketry/async/lib/async/task.rb:62:in `yield'
               |   		  /Users/samuel/Documents/socketry/async/lib/async/task.rb:62:in `yield'
               |   		  /Users/samuel/Documents/socketry/async/lib/async/condition.rb:40:in `wait'
               |   		  /Users/samuel/Documents/socketry/async/lib/async/task.rb:150:in `wait'
               |   		  /Users/samuel/Documents/socketry/async-io/spec/async/io/stream_spec.rb:90:in `block (5 levels) in <top (required)>'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/rspec-expectations-3.10.1/lib/rspec/matchers/built_in/raise_error.rb:59:in `matches?'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/rspec-expectations-3.10.1/lib/rspec/expectations/handler.rb:51:in `block in handle_matcher'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/rspec-expectations-3.10.1/lib/rspec/expectations/handler.rb:27:in `with_matcher'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/rspec-expectations-3.10.1/lib/rspec/expectations/handler.rb:48:in `handle_matcher'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/rspec-expectations-3.10.1/lib/rspec/expectations/expectation_target.rb:65:in `to'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/rspec-expectations-3.10.1/lib/rspec/expectations/expectation_target.rb:101:in `to'
               |   		  /Users/samuel/Documents/socketry/async-io/spec/async/io/stream_spec.rb:89:in `block (4 levels) in <top (required)>'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/rspec-core-3.10.1/lib/rspec/core/example.rb:262:in `instance_exec'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/rspec-core-3.10.1/lib/rspec/core/example.rb:262:in `block in run'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/rspec-core-3.10.1/lib/rspec/core/example.rb:508:in `block in with_around_and_singleton_context_hooks'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/rspec-core-3.10.1/lib/rspec/core/example.rb:465:in `block in with_around_example_hooks'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/rspec-core-3.10.1/lib/rspec/core/hooks.rb:486:in `block in run'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/rspec-core-3.10.1/lib/rspec/core/hooks.rb:626:in `block in run_around_example_hooks_for'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/rspec-core-3.10.1/lib/rspec/core/example.rb:350:in `call'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/async-rspec-1.15.1/lib/async/rspec/reactor.rb:92:in `block (3 levels) in <module:RSpec>'
               |   		  /Users/samuel/.gem/ruby/3.0.0/gems/async-rspec-1.15.1/lib/async/rspec/reactor.rb:61:in `block (2 levels) in run_in_reactor'
               |   		  /Users/samuel/Documents/socketry/async/lib/async/task.rb:265:in `block in make_fiber'
               |   			#<Async::Task:0x943e78 (running)>
               |   			→ /Users/samuel/Documents/socketry/async/lib/async/task.rb:62:in `yield'
               |   			  /Users/samuel/Documents/socketry/async/lib/async/task.rb:62:in `yield'
               |   			  /Users/samuel/Documents/socketry/async/lib/async/wrapper.rb:233:in `wait_for'
               |   			  /Users/samuel/Documents/socketry/async/lib/async/wrapper.rb:139:in `wait_writable'
               |   			  /Users/samuel/Documents/socketry/async/lib/async/scheduler.rb:61:in `io_wait'
               |   			  /Users/samuel/.rubies/ruby-3.0.0/lib/ruby/3.0.0/forwardable.rb:238:in `write'
               |   			  /Users/samuel/.rubies/ruby-3.0.0/lib/ruby/3.0.0/forwardable.rb:238:in `write'
               |   			  /Users/samuel/Documents/socketry/async-io/lib/async/io/stream.rb:160:in `block in flush'
               |   			  /Users/samuel/Documents/socketry/async/lib/async/semaphore.rb:80:in `acquire'
               |   			  /Users/samuel/Documents/socketry/async-io/lib/async/io/stream.rb:155:in `flush'
               |   			  /Users/samuel/Documents/socketry/async-io/spec/async/io/stream_spec.rb:86:in `block (5 levels) in <top (required)>'
               |   			  /Users/samuel/Documents/socketry/async/lib/async/task.rb:265:in `block in make_fiber'
               |   → /Users/samuel/.gem/ruby/3.0.0/gems/async-rspec-1.15.1/lib/async/rspec/reactor.rb:51 in `block in run_in_reactor'
               |     /Users/samuel/Documents/socketry/async/lib/async/task.rb:265 in `block in make_fiber'
      handles write failures (FAILED - 1)

Finished in 10.05 seconds (files took 0.30495 seconds to load)
1 example, 1 failure

Failed examples:

rspec ./spec/async/io/stream_spec.rb:81 # Async::IO::Stream socket I/O #drain_write_buffer handles write failures

samuel@Fukurou ~/D/s/async-io (master) [1]> chruby ruby-head
samuel@Fukurou ~/D/s/async-io (master)> bundle exec rspec ./spec/async/io/stream_spec.rb:81
warning: parser/current is loading parser/ruby30, which recognizes
warning: 3.0.x-compliant syntax, but you are running 3.1.0.
warning: please see https://github.com/whitequark/parser#compatibility-with-ruby-mri.
Run options: include {:locations=>{"./spec/async/io/stream_spec.rb"=>[81]}}

Async::IO::Stream
  socket I/O
    #drain_write_buffer
rb_io_wait_writable(13) -> errno=32
  0.0s    error: Async::Task [oid=0x11e4] [pid=1457] [2021-03-30 20:01:06 +1300]
               |   Errno::EPIPE: Broken pipe
               |   → /Users/samuel/.rubies/ruby-head/lib/ruby/3.1.0/forwardable.rb:238 in `write'
               |     /Users/samuel/.rubies/ruby-head/lib/ruby/3.1.0/forwardable.rb:238 in `write'
               |     /Users/samuel/Documents/socketry/async-io/lib/async/io/stream.rb:160 in `block in flush'
               |     /Users/samuel/Documents/socketry/async/lib/async/semaphore.rb:80 in `acquire'
               |     /Users/samuel/Documents/socketry/async-io/lib/async/io/stream.rb:155 in `flush'
               |     /Users/samuel/Documents/socketry/async-io/spec/async/io/stream_spec.rb:86 in `block (5 levels) in <top (required)>'
               |     /Users/samuel/Documents/socketry/async/lib/async/task.rb:265 in `block in make_fiber'
      handles write failures

Finished in 0.04441 seconds (files took 0.46472 seconds to load)
1 example, 0 failures

This happens when using a socket pair, explicitly closing one end, and trying to write to the other end.

Updated by ioquatix (Samuel Williams) almost 3 years ago

  • Backport changed from 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONTNEED, 3.0: UNKNOWN to 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONTNEED, 3.0: REQUIRED

I would advise we should back port this to 3.0.1 if possible. The original patch attached to this issue is sufficient.

Actions #10

Updated by ioquatix (Samuel Williams) almost 3 years ago

  • Status changed from Open to Closed

Updated by ioquatix (Samuel Williams) almost 3 years ago

@nagachika (Tomoyuki Chikanaga) can we please backport this for 3.0.2?

Updated by nagachika (Tomoyuki Chikanaga) almost 3 years ago

  • Backport changed from 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONTNEED, 3.0: REQUIRED to 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONTNEED, 3.0: DONE

ruby_3_0 13f93ad16d3d1ecf96ece229cd4bc5ea294e1a71 merged revision(s) 611e711085c7e3984555a79626d025c8b876eced,a9c5c2d614f30a616970245fef3e7ffc151e2ecf.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0