Project

General

Profile

Actions

Feature #11607

closed

[PATCH] fiddle: release GVL for ffi_call

Added by normalperson (Eric Wong) over 8 years ago. Updated over 8 years ago.

Status:
Closed
Target version:
-
[ruby-core:71121]

Description

Some external functions I wish to call may take a long time
and unnecessarily block other threads. This may lead to performance
regressions for fast functions as releasing/acquiring the GVL is not
cheap, but can improve performance for long-running functions
in multi-threaded applications.

This also means we must reacquire the GVL when calling Ruby-defined
callbacks for Fiddle::Closure, meaning we must detect whether the
current thread has the GVL by exporting ruby_thread_has_gvl_p
in internal.h


Files

Updated by normalperson (Eric Wong) over 8 years ago

v2 fixes a warning I did not notice before, interdiff:

--- a/ext/fiddle/closure.c
+++ b/ext/fiddle/closure.c
@@ -64,7 +64,7 @@ struct callback_args {
     void *ctx;
 };
 
-static void
+static void *
 with_gvl_callback(void *ptr)
 {
     struct callback_args *x = ptr;
@@ -177,6 +177,7 @@ with_gvl_callback(void *ptr)
       default:
 	rb_raise(rb_eRuntimeError, "closure retval: %d", type);
     }
+    return 0;
 }
 
 static void

Updated by kosaki (Motohiro KOSAKI) over 8 years ago

On Tue, Oct 20, 2015 at 6:28 PM, wrote:

Issue #11607 has been updated by Eric Wong.

File fiddle-release-GVL-for-ffi_call-v2.patch added

v2 fixes a warning I did not notice before, interdiff:

--- a/ext/fiddle/closure.c
+++ b/ext/fiddle/closure.c
@@ -64,7 +64,7 @@ struct callback_args {
     void *ctx;
 };

-static void
+static void *
 with_gvl_callback(void *ptr)
 {
     struct callback_args *x = ptr;
@@ -177,6 +177,7 @@ with_gvl_callback(void *ptr)
       default:
        rb_raise(rb_eRuntimeError, "closure retval: %d", type);
     }
+    return 0;

This interdiff is really ugly to me. Do we really have no other way?

Updated by normalperson (Eric Wong) over 8 years ago

KOSAKI Motohiro wrote:

On Tue, Oct 20, 2015 at 6:28 PM, wrote:

Issue #11607 has been updated by Eric Wong.

File fiddle-release-GVL-for-ffi_call-v2.patch added

v2 fixes a warning I did not notice before, interdiff:

--- a/ext/fiddle/closure.c
+++ b/ext/fiddle/closure.c
@@ -64,7 +64,7 @@ struct callback_args {
     void *ctx;
 };

-static void
+static void *
 with_gvl_callback(void *ptr)
 {
     struct callback_args *x = ptr;
@@ -177,6 +177,7 @@ with_gvl_callback(void *ptr)
       default:
        rb_raise(rb_eRuntimeError, "closure retval: %d", type);
     }
+    return 0;

This interdiff is really ugly to me. Do we really have no other way?

I'm not sure what you mean. I made the change to match the signature of
rb_thread_call_with_gvl, and I think rb_thread_call_with_gvl is a
reasonable API.

I could do s/0/NULL/ if that's what you mean. I don't have a strong
opinion on '0' vs 'NULL', and Ruby source seems to use both.

Updated by kosaki (Motohiro KOSAKI) over 8 years ago

Nevermind. I did misinterpret your code. I'm ok either.

Updated by naruse (Yui NARUSE) over 8 years ago

Calling function is really always MT-safe?
The user of fiddle must check whether the function is MT-safe and maintain fine grained lock if it's not MT-safe?

Updated by normalperson (Eric Wong) over 8 years ago

Yes, user must check if the function is MT-safe. Probably fine
for most library functions...

Maybe releasing GVL can be optional, but I would rather avoid
exposing implementation detail or new APIs...

Updated by tenderlovemaking (Aaron Patterson) over 8 years ago

On Tue, Oct 27, 2015 at 08:54:07AM +0000, Eric Wong wrote:

Yes, user must check if the function is MT-safe. Probably fine
for most library functions...

Maybe releasing GVL can be optional, but I would rather avoid
exposing implementation detail or new APIs...

I think it's fine. Calling a C function from fiddle that requires the
GVL to be locked seems like an edge case. Maybe we can make an option
to maintain keep the GVL locked, and make "unlocking the GVL" default
behavior.

--
Aaron Patterson
http://tenderlovemaking.com/

Updated by normalperson (Eric Wong) over 8 years ago

Aaron Patterson wrote:

On Tue, Oct 27, 2015 at 08:54:07AM +0000, Eric Wong wrote:

Yes, user must check if the function is MT-safe. Probably fine
for most library functions...

Maybe releasing GVL can be optional, but I would rather avoid
exposing implementation detail or new APIs...

I think it's fine. Calling a C function from fiddle that requires the
GVL to be locked seems like an edge case. Maybe we can make an option
to maintain keep the GVL locked, and make "unlocking the GVL" default
behavior.

AFAIK, fiddle does not have many users right now[1], correct?
If Ruby eventually loses the GVL, they could get screwed later on if
they rely on GVL, too. So any potentially breaking change should be
sooner rather than later.

But maybe we introduce a new API/option now to release GVL now.
If/when Ruby loses the GVL; we implement a GFL (Global Fiddle Lock)
and use GFL as default behavior; while the API/option to release
the GVL releases the GFL instead.

I also have some ideas to hopefully make the current GVL cheaper.

[1] I'm not entirely sure why fiddle was introduced since the 'ffi'
gem was already prevalent. Was it to keep compatibility with
'dl'? Well, at least I can contribute to fiddle without dealing
ith a non-Free SaaS.

Updated by normalperson (Eric Wong) over 8 years ago

So, should I commit the patch as-is, or perhaps add a new
option/method for releasing the GVL?

Naming new options/methods hard for me :<

Updated by tenderlovemaking (Aaron Patterson) over 8 years ago

On Fri, Nov 13, 2015 at 05:08:36AM +0000, Eric Wong wrote:

So, should I commit the patch as-is, or perhaps add a new
option/method for releasing the GVL?

Naming new options/methods hard for me :<

Yes, please commit it! :)

--
Aaron Patterson
http://tenderlovemaking.com/

Actions #11

Updated by Anonymous over 8 years ago

  • Status changed from Open to Closed

Applied in changeset r52723.


fiddle: release GVL for ffi_call

Some external functions I wish to call may take a long time
and unnecessarily block other threads. This may lead to performance
regressions for fast functions as releasing/acquiring the GVL is not
cheap, but can improve performance for long-running functions
in multi-threaded applications.

This also means we must reacquire the GVL when calling Ruby-defined
callbacks for Fiddle::Closure, meaning we must detect whether the
current thread has the GVL by exporting ruby_thread_has_gvl_p
in internal.h

  • ext/fiddle/function.c (struct nogvl_ffi_call_args):
    new struct for GVL release
    (nogvl_ffi_call): new function
    (function_call): adjust for GVL release
    [ruby-core:71642] [Feature #11607]
  • ext/fiddle/closure.c (struct callback_args):
    new struct for GVL acquire
    (with_gvl_callback): adjusted original callback function
    (callback): wrapper for conditional GVL acquire
  • ext/fiddle/depend: add dependencies
  • ext/fiddle/extconf.rb: include top_srcdir for internal.h
  • internal.h (ruby_thread_has_gvl_p): expose for fiddle
  • vm_core.h (ruby_thread_has_gvl_p): moved to internal.h
  • test/fiddle/test_function.rb (test_nogvl_poll): new test

Updated by ngoto (Naohisa Goto) over 8 years ago

After r52723, SEGV occurred during Fiddle::TestFunc#test_qsort1 test/fiddle/test_func.rb:83 on Solaris 10 i386 on RubyCI.

(r52725) http://rubyci.s3.amazonaws.com/unstable10x/ruby-trunk/log/20151123T224815Z.fail.html.gz

Because this occurred in the CI environment, no other detailed information was available.

This is not reproduced on sparc Solaris.

Updated by normalperson (Eric Wong) over 8 years ago

wrote:

After r52723, SEGV occurred during Fiddle::TestFunc#test_qsort1 test/fiddle/test_func.rb:83 on Solaris 10 i386 on RubyCI.

http://rubyci.s3.amazonaws.com/unstable10x/ruby-trunk/log/20151123T224815Z.fail.html.gz

Because this occurred in the CI environment, no other detailed information was available.

This is not reproduced on sparc Solaris.

Any chance you can log into the i386 VM and reproduce the error
in a standalone script?

In case the qsort implementation on Solaris has global state,
I wouldn't expect our test suite would be running other threads
also calling qsort and messing up the global state.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0