Project

General

Profile

Actions

Bug #14642

closed

Fiber make crash on Windows - webrick/httpproxy.rb ?

Added by usa (Usaku NAKAMURA) about 6 years ago. Updated about 6 years ago.

Status:
Closed
Target version:
-
ruby -v:
trunk r62966
[ruby-core:86376]

Description

We found that r62966 causes crash on Windows.
Of course, the patch seems to be no problem, so I guess that this code just happened to reveal a potential bug.

FYI, AppVayor and mswinci both reported the crash, but they didn't show any details.

Stable versions don't contain the patch, but will be backported later, maybe.

Updated by MSP-Greg (Greg L) about 6 years ago

  • Subject changed from Fiber make crash on Windows to Fiber make crash on Windows - webrick/httpproxy.rb ?

I noticed this with ruby-loco MinGW on 62969. I did some checking last night (-0500), and rechecked this morning using 63034.

The following tests cause silent fails:

test/webrick/test_httpproxy.rb
  #test_proxy
  #test_big_bodies
  #test_upstream_proxy

test/open-uri/test_open-uri.rb
  #test_proxy
  #test_proxy_http_basic_authentication_success
  #test_authenticated_proxy_http_basic_authentication_success

In httpproxy.rb, the following method exists. I added puts before and after the perform_proxy_request call, and the return puts was never output. I mocked up a fiber example, as I've seen an issue in windows with Continuation being used in an enum block, but I couldn't see anything wrong with that. Maybe something with the Lambda/socket code...

def do_GET(req, res)
  perform_proxy_request(req, res, Net::HTTP::Get)
end

Thanks, Greg

Updated by MSP-Greg (Greg L) about 6 years ago

Apologies for the previous post, it's not the clearest...

If the above listed tests are disabled, testing the open-uri and webrick folders completes with no fails or errors.

If any of the above listed tests are enabled, testing silently stops (as in SEGV) with no test summary, and returns to a command prompt.

I believe the code causing the issue is WEBrick::HTTPProxyServer#perform_proxy_request, shown here https://github.com/ruby/ruby/blob/46b391ff731d66883082e6347d5fc4e54386d7bd/lib/webrick/httpproxy.rb#L298-L347.

Thanks, Greg

Updated by MSP-Greg (Greg L) about 6 years ago

I did a vc14 build (ruby 2.6.0dev (2018-03-31 trunk 63043) [x64-mswin64_140]), it behaves the same as the MinGW build.

Using the test TestWEBrickHTTPProxy#test_proxy in test/webrick/test_httpproxy.rb, when HTTPProxyServer#perform_proxy_request (lib/webrick/httpproxy.rb) is called

link - https://github.com/ruby/ruby/blob/46b391ff731d66883082e6347d5fc4e54386d7bd/lib/webrick/httpproxy.rb#L298-L347.

The code runs to line 334, the Fiber.yield statement, and stops, as in back to command prompt, with no indication of the problem. The fiber is first started by req_fib.resume on line 339.

Thanks, Greg

Updated by MSP-Greg (Greg L) about 6 years ago

I have tried to isolate the issue with Fiber, but haven't succeeded.

Please look at the two below (old) issues, which also involve cont.c (sorry for the duplication):
https://bugs.ruby-lang.org/issues/13298
https://bugs.ruby-lang.org/issues/13485

Using a current MinGW build, the following still generates a SEGV (it is one of two patches used in ruby-loco to allow it to pass test-all, the other is readline 'meta key' issue):

ruby -W --disable-gems runner.rb -I./lib ruby/test_enum.rb -ntest_callcc

The test is passing on Ruby Appveyor & Travis, and RubyCI.org, and also passes on a recent, locally created vc14 build, ruby 2.6.0dev (2018-03-31 trunk 63043) [x64-mswin64_140].

Given that, and given that MinGW is somewhat of a hybrid between *nix and Windows, might cont.c be 'misclassifying' MinGW somewhere in its code? Just a guess, as I don't know c...

Finally, for the time being, I have reverted 62966 (see https://github.com/MSP-Greg/ruby-loco/blob/master/patches/gte20600/windows_fiber_webrick_proxy.patch), so ruby-loco can be used for testing more recent commits and CI testing in other repositories.

Thanks, Greg

Updated by naruse (Yui NARUSE) about 6 years ago

I reverted r62966 and r62969 to keep CI green.
Could you re-commit them when you commit the fix?

Actions #6

Updated by ko1 (Koichi Sasada) about 6 years ago

  • Status changed from Assigned to Closed

Applied in changeset trunk|r63073.


Fix Fiber with Thread issue on Windows [Bug #14642]

  • cont.c (rb_threadptr_root_fiber_setup): divide into two functions:

    • rb_threadptr_root_fiber_setup_by_parent(): called by the parent thread.
    • rb_threadptr_root_fiber_setup_by_child(): called by the created thread.

    rb_threadptr_root_fiber_setup() is called by the parent thread and
    set fib->fib_handle by ConvertThreadToFiber() on the parent thread on
    Windows enveironment.
    This means that root_fib->fib_handle of child thread is initialized
    with parent thread's Fiber handle. Furthermore, second call of
    ConvertThreadToFiber() for the same thread fails.

    This patch solves this weird situateion. However, maybe we can make more
    clean code.

  • thread.c (thread_start_func_2): call
    rb_threadptr_root_fiber_setup_by_child() at thread initialize routine.

  • vm.c (th_init): call rb_threadptr_root_fiber_setup_by_parent().

Updated by ko1 (Koichi Sasada) about 6 years ago

Sorry for late fixing.

The easy repro is here:

  def assert_create_fiber_in_new_thread
    ret = Thread.new{
      Thread.new{
        Fiber.new{Fiber.yield :ok}.resume
      }.join
    }.join
    assert_euqal :ok, ret, '[Bug #14642]'
  end

And I believe r63073 will solve this issue. I'll clean this code later.

Eric, I repatch your webrick patch at r63074. Please fix if I missed anything....

Thanks,
Koichi

Updated by MSP-Greg (Greg L) about 6 years ago

@ko1 (Koichi Sasada)

Thanks for the work. The test is not being run as assert_euqal contains a small typo. See GitHub PR 1855.

Greg

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0