Feature #5543
closedrb_thread_blocking_region() API is poorly designed
Added by cjameshuff (Christopher Huff) about 13 years ago. Updated over 12 years ago.
Description
First, rb_thread_blocking_region() requires the blocking code to be pulled out into a separate function, scattering code through the source file and giving the coder more work to do to pass information through to that function. Something like rb_thread_blocking_region_begin() and rb_thread_blocking_region_end(), or the BLOCKING_REGION macro used to implement rb_thread_blocking_region(), would be far more convenient to use, but were apparently deprecated and are now only usable within thread.c.
Worse, the function passed to rb_thread_blocking_region() must return a Ruby VALUE, but also must execute without a VM lock. It is rather nonsensical to specify that a function return a Ruby object while forbidding it from accessing most of Ruby. It is likely the function won't touch the anything related to Ruby at all, and while you can use casting to work around it, you shouldn't have to. The main result of all this is less readable and even somewhat misleading code.
Updated by normalperson (Eric Wong) about 13 years ago
Christopher Huff cjameshuff@gmail.com wrote:
First, rb_thread_blocking_region() requires the blocking code to be
pulled out into a separate function, scattering code through the
source file and giving the coder more work to do to pass information
through to that function. Something like
rb_thread_blocking_region_begin() and rb_thread_blocking_region_end(),
or the BLOCKING_REGION macro used to implement
rb_thread_blocking_region(), would be far more convenient to use, but
were apparently deprecated and are now only usable within thread.c.
rb_thread_blocking_region_begin() and rb_thread_blocking_region_end()
require malloc()/free(), so there may be a performance impact there
for some code.
The BLOCKING_REGION macros make it too easy to break ABI compatibility
and require users to rebuild 3rd-party libraries. ABI compatibility is
apparently an important thing for users on crippled OSes that don't
include free compilers :<
Worse, the function passed to rb_thread_blocking_region() must return
a Ruby VALUE, but also must execute without a VM lock. It is rather
nonsensical to specify that a function return a Ruby object while
forbidding it from accessing most of Ruby. It is likely the function
won't touch the anything related to Ruby at all, and while you can use
casting to work around it, you shouldn't have to. The main result of
all this is less readable and even somewhat misleading code.
No return type can possibly satisfy everyone who will use this function,
VALUE is probably the least bad since the object the VALUE refers to
could be created before entering the blocking region.
I had no hand/influence in the design of this API, but I feel it's the
best API given the circumstances (need for a GVL + compatibility).
Updated by cjameshuff (Christopher Huff) about 13 years ago
VALUE is actively misleading, given that a VALUE can not be constructed by the function, and the writer of the code will most likely not want to return a pre-constructed one. "void *" is the obvious choice, not carrying such an implication and making it clear that greater care is necessary when using the function...in particular, making it clear that what it returns can't be assumed to be a valid VALUE. If we're going to be stuck with an awkward API that forces us to spin off little side functions and cram data through a single input pointer and single return pointer, we should at least not have to deal with an API that lies to us.
Updated by mdalessio (Mike Dalessio) almost 13 years ago
Another issue with the current design is that it does not support
callback-oriented APIs.
First, a counter-example. The current API perfectly supports the
following pattern:
- unlock GVL
- perform an expensive (or blocking) operation
- re-lock the GVL when ready to re-enter RubyLand
For example:
/* calling pattern that is supported by the current API */
VALUE time_consuming_thing(void data)
{
/ ... something expensive ... */
}
/* native extension Ruby method */
static VALUE get_it_for_me(VALUE self, VALUE foo)
{
rb_thread_blocking_region(time_consuming_thing, 0, RUBY_UBF_IO, 0):
}
But if I am calling an event-oriented API (for example, an
subscribable database that invokes a callback when new records are
added), then I am stuck. An example of what I want to do:
static int read_callback(VALUE self, Event *event) {
rb_thread_blocking_region_end();
rb_funcall(callable, "callback", convert_to_ruby(event));
rb_thread_blocking_region_begin();
}
/* native extension Ruby method /
VALUE stream_events(VALUE self)
{
rb_thread_blocking_region_begin();
db->read(read_callback, self); / this call may block for minutes, and may call read_callback multiple times */
rb_thread_blocking_region_end();
}
It appears that the current implementation (1.9.3) of
rb_thread_blocking_region_begin() and _end() does in fact support this
calling style, but they are not global symbols, and so I cannot call
them from my C extension.
Updated by normalperson (Eric Wong) almost 13 years ago
Mike Dalessio mike.dalessio@gmail.com wrote:
It appears that the current implementation (1.9.3) of
rb_thread_blocking_region_begin() and _end() does in fact support this
calling style, but they are not global symbols, and so I cannot call
them from my C extension.
rb_thread_call_with_gvl() is globally-visible (but not in headers)
for 1.9.3: https://bugs.ruby-lang.org/issues/4328
Perhaps you can use that?
Updated by mdalessio (Mike Dalessio) almost 13 years ago
rb_thread_call_with_gvl() is globally-visible (but not in headers)
for 1.9.3: https://bugs.ruby-lang.org/issues/4328Perhaps you can use that?
That is exactly what I need.
Thanks so much for the advice. You made my day! <3 <3 <3
Updated by drbrain (Eric Hodel) almost 13 years ago
I'm using rb_thread_call_with_gvl as well to support GLUT (OpenGL toolkit) callbacks.
I'd like something like this to be officially supported, not just accessible.
Updated by ko1 (Koichi Sasada) over 12 years ago
- Assignee set to ko1 (Koichi Sasada)
- Target version set to 2.0.0
Christopher Huff wrote:
VALUE is actively misleading, given that a VALUE can not be constructed by the function, and the writer of the code will most likely not want to return a pre-constructed one. "void *" is the obvious choice, not carrying such an implication and making it clear that greater care is necessary when using the function...in particular, making it clear that what it returns can't be assumed to be a valid VALUE. If we're going to be stuck with an awkward API that forces us to spin off little side functions and cram data through a single input pointer and single return pointer, we should at least not have to deal with an API that lies to us.
Should we change it from VALUE to 'void *'?
It's not big compatibility issue, I think (maybe it causes several warnings at compiling time).
Updated by shyouhei (Shyouhei Urabe) over 12 years ago
- Status changed from Open to Assigned
Updated by ko1 (Koichi Sasada) over 12 years ago
- Status changed from Assigned to Feedback
ping: cjameshuff
Updated by larskanis1 (Lars Kanis) over 12 years ago
ko1 (Koichi Sasada) wrote:
Should we change it from VALUE to 'void *'?
IMHO VALUE is misleading and should be changed for rb_thread_blocking_region() and rb_blocking_function_t. But isn't it better to use "int" instead of "void*" ? Where should the "void*" point to? For data1/data2 a "void*" is obviously the right choice to pass a locally defined struct with parameters to (or back from) the blocking function. But the return should be a copied value instead of a pointer. The same applies to rb_thread_call_with_gvl().
When I used rb_thread_blocking_region(), I was wondering about the return type VALUE, too. So for instance in the pkcs11.gem the return value is passed via a struct through *data1 to avoid the use of misleading VALUE at all.
Updated by nobu (Nobuyoshi Nakada) over 12 years ago
"int" may not be large enough to keep a pointer.
Updated by ko1 (Koichi Sasada) over 12 years ago
(2012/07/03 18:41), nobu (Nobuyoshi Nakada) wrote:
"int" may not be large enough to keep a pointer.
+1.
--
// SASADA Koichi at atdot dot net
Updated by larskanis1 (Lars Kanis) over 12 years ago
OK, then use 'void *' like rb_thread_call_with_gvl().
Updated by ko1 (Koichi Sasada) over 12 years ago
(2012/07/08 5:06), larskanis1 (Lars Kanis) wrote:
OK, then use 'void *' like rb_thread_call_with_gvl().
We conclude this feature follows:
(1) Don't touch the declaration of rb_thread_blocking_region().
And mark it as obsolete.
(the name "blocking_region" is internal name. I think it was bad name
to expose.
BLOCKING_REGION(
// from here
do something without gvl
// to here
)
)
Do not use this API for newer extensions. And replace it with the
call_without_gvl if you can.
(2) Change the return type rb_thread_call_without_gvl()
void *
rb_thread_call_without_gvl(
void *(*func)(void *), void *data1,
rb_unblock_function_t *ubf, void *data2)
rb_thread_call_without_gvl() and rb_thread_call_with_gvl() is
experimental (not exposed officially) function. So we can expect that
only a few "compiling error" with it.
--
// SASADA Koichi at atdot dot net
Updated by nobu (Nobuyoshi Nakada) over 12 years ago
- Status changed from Feedback to Closed
- % Done changed from 0 to 100
This issue was solved with changeset r36355.
Christopher, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
rb_thread_call_without_gvl
- include/ruby/thread.h: new header file for thread stuff.
- thread.c (rb_thread_call_without_gvl): export. [Feature#4328]
returns void* instead of VALUE. [Feature #5543] - thread.c (rb_thread_blocking_region): deprecate. [ruby-core:46295]