Feature #3620

Add Queue, SIzedQueue and ConditionVariable implementations in C in addition to ruby ones

Added by Ricardo Panaggio over 3 years ago. Updated 7 months ago.

[ruby-core:31513]
Status:Closed
Priority:Normal
Assignee:Koichi Sasada
Category:ext
Target version:next minor

Description

=begin
Queue, SizedQueue and ConditionVariable are important synchronization primitives and are nowadays implemented in Ruby.

Attached patch (initiated by myself and heavily enriched by Nobu) contains these sync primitives implemented in C, which makes them faster (see [1] for the benchmark's code):

Rehearsal -------------------------------------------------
Q#push 1.590000 0.010000 1.600000 ( 1.605502)
T#push 0.600000 0.010000 0.610000 ( 0.630444)
Q#pop 4.390000 0.000000 4.390000 ( 4.389781)
T#pop 0.580000 0.000000 0.580000 ( 0.578918)
Q#empty? 0.480000 0.000000 0.480000 ( 0.484305)
T#empty? 0.360000 0.000000 0.360000 ( 0.358559)
Q#clear 1.210000 0.000000 1.210000 ( 1.214494)
T#clear 0.600000 0.000000 0.600000 ( 0.588611)
Q#size 0.370000 0.000000 0.370000 ( 0.365587)
T#size 0.350000 0.000000 0.350000 ( 0.356985)
Q#numwaiting 0.380000 0.000000 0.380000 ( 0.379199)
T#num
waiting 0.370000 0.000000 0.370000 ( 0.368075)
--------------------------------------- total: 11.300000sec

It has already been discussed on ruby-core (see ruby-core:31100).

This patch is one of the deliverables of my RubySoC project (slot #17): "Improving Ruby's Synchronization Primitives and Core Libraries" [2,3]

[1] http://github.com/panaggio/rubysoc-2010/blob/master/benchmarks/queue.rb
[2] http://pastebin.com/viSnfqe6
[3] http://rubysoc.org/projects
=end

final-queue.patch Magnifier - patch with proposed alterations: Queue, SizedQueue and ConditionVariable in C (18 KB) Ricardo Panaggio, 07/28/2010 02:51 AM

final_queue_without_mutex.diff Magnifier - final_queue patch without using mutex (relying on GVL) (16.7 KB) Yura Sokolov, 06/26/2012 04:18 PM

patch.diff Magnifier - bug fixed final_queue_without_mutex.diff (24.3 KB) Masaki Matsushita, 05/17/2013 10:59 PM

patch2.diff Magnifier - using rb_funcall2() (24.4 KB) Masaki Matsushita, 05/22/2013 09:07 PM

thread.c Magnifier (12.4 KB) Koichi Sasada, 08/29/2013 06:54 PM

patch3.diff Magnifier (23 KB) Masaki Matsushita, 09/05/2013 08:52 PM

Associated revisions

Revision 42862
Added by glass 7 months ago

  • common.mk: use RUNRUBY instead of MINIRUBY because MINIRUBY can't
    require extension libraries. The patch is from nobu
    (Nobuyoshi Nakada).

  • ext/thread/extconf.rb: for build ext/thread/thread.c.

  • include/ruby/intern.h: ditto.

  • thread.c: ditto.

  • lib/thread.rb: removed and replaced by ext/thread/thread.c.

  • ext/thread/thread.c: Queue, SizedQueue and ConditionVariable
    implementations in C. This patch is based on patches from panaggio
    (Ricardo Panaggio) and funny_falcon (Yura Sokolov) and ko1
    (Koichi Sasada). [Feature #3620]

  • test/thread/testqueue.rb (testqueuethreadraise): add a test for
    ensuring that killed thread should be removed from waiting threads.
    It is based on a code by ko1 (Koichi Sasada).

History

#1 Updated by Akira Tanaka almost 3 years ago

  • Project changed from Ruby to ruby-trunk
  • Category changed from ext to ext

#2 Updated by Eric Wong over 2 years ago

ping? The subject of Queue performance came up again in:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/391324

#3 Updated by Eric Wong over 2 years ago

Issue #3620 has been updated by Eric Wong.

ping? The subject of Queue performance came up again in:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/391324


Feature #3620: Add Queue, SIzedQueue and ConditionVariable implementations in C in addition to ruby ones
http://redmine.ruby-lang.org/issues/3620

Author: Ricardo Panaggio
Status: Open
Priority: Normal
Assignee:
Category: ext
Target version:

=begin
Queue, SizedQueue and ConditionVariable are important synchronization primitives and are nowadays implemented in Ruby.

Attached patch (initiated by myself and heavily enriched by Nobu) contains these sync primitives implemented in C, which makes them faster (see [1] for the benchmark's code):

Rehearsal -------------------------------------------------
Q#push 1.590000 0.010000 1.600000 ( 1.605502)
T#push 0.600000 0.010000 0.610000 ( 0.630444)
Q#pop 4.390000 0.000000 4.390000 ( 4.389781)
T#pop 0.580000 0.000000 0.580000 ( 0.578918)
Q#empty? 0.480000 0.000000 0.480000 ( 0.484305)
T#empty? 0.360000 0.000000 0.360000 ( 0.358559)
Q#clear 1.210000 0.000000 1.210000 ( 1.214494)
T#clear 0.600000 0.000000 0.600000 ( 0.588611)
Q#size 0.370000 0.000000 0.370000 ( 0.365587)
T#size 0.350000 0.000000 0.350000 ( 0.356985)
Q#numwaiting 0.380000 0.000000 0.380000 ( 0.379199)
T#num
waiting 0.370000 0.000000 0.370000 ( 0.368075)
--------------------------------------- total: 11.300000sec

It has already been discussed on ruby-core (see ruby-core:31100).

This patch is one of the deliverables of my RubySoC project (slot #17): "Improving Ruby's Synchronization Primitives and Core Libraries" [2,3]

[1] http://github.com/panaggio/rubysoc-2010/blob/master/benchmarks/queue.rb
[2] http://pastebin.com/viSnfqe6
[3] http://rubysoc.org/projects
=end

--
http://redmine.ruby-lang.org

#4 Updated by Hiroshi Nakamura about 2 years ago

  • Description updated (diff)
  • Assignee set to Koichi Sasada

#5 Updated by Shyouhei Urabe about 2 years ago

  • Status changed from Open to Assigned

#6 Updated by Yura Sokolov about 2 years ago

Related issue (but about plain ruby realization)
https://bugs.ruby-lang.org/issues/6174

#7 Updated by Koichi Sasada almost 2 years ago

Sorry for absent from this request.

Eric, can this patch work fine on current trunk?
... which patch should I use?

#8 Updated by Eric Wong almost 2 years ago

"ko1 (Koichi Sasada)" redmine@ruby-lang.org wrote:

Eric, can this patch work fine on current trunk?
... which patch should I use?

Yes, I just tested final-queue.patch with current trunk (r36217)
and it works.

The .gitignore hunk in the patch can be ignored, everything
else creates a new file and doesn't conflict.

This is Richard's patch, btw. I'm not sure if Richard still
pays attention to Ruby (haven't seen any activity from him in
a while).

push/pop are still significantly faster on current trunk with the
queue.rb benchmark:

                  user     system      total        real

Q#push 0.640000 0.000000 0.640000 ( 0.642801)
T#push 0.200000 0.000000 0.200000 ( 0.197539)
Q#pop 1.390000 0.000000 1.390000 ( 1.393139)
T#pop 0.230000 0.000000 0.230000 ( 0.228802)

(https://raw.github.com/panaggio/rubysoc-2010/master/benchmarks/queue.rb)

#9 Updated by Yura Sokolov almost 2 years ago

It seems that there is no need for mutex in a native queue implementation (considering we have GVL),
and so that queuesynchronized wrapper:
rb
mutexsleep could be replaces with rbthreadsleepforever without semantic change.

Without mutex, native queue becomes 2 times faster.
Modified patch is attached.

#10 Updated by Koichi Sasada almost 2 years ago

Hi,

(2012/06/26 16:18), funny_falcon (Yura Sokolov) wrote:

It seems that there is no need for mutex in a native queue implementation (considering we have GVL),
and so that queuesynchronized wrapper:
rb
mutexsleep could be replaces with rbthreadsleepforever without semantic change.

Without mutex, native queue becomes 2 times faster.
Modified patch is attached.

I found a bug.

### begin sample code
q = Thread::Queue.new
#q = Queue.new
th1 = Thread.new{
begin
p [:th1, q.pop]
rescue RuntimeError => e
sleep
p e
end
}
th2 = Thread.new{
sleep 0.1
p [:th2, q.pop]
}
p [th1, th2]
sleep 0.5
th1.raise "async interrupt!"
sleep 0.5
q << :s
# th1.join
p [th1, th2]
th2.join # BLOCK forever!
### end sample code

When th1 escapes from blocking by "pop" by exception, then waiting list
of Queue should be maintained (remove th1 from waiting list).

Other comment:

  • "extthread" is good name?

  • variable name

    static void
    wakeupallthreads(VALUE list)
    {
    VALUE thread, list0 = list;
    long i;

    list = rbarysubseq(list, 0, LONGMAX);
    rb
    aryclear(list0);
    for (i = 0; i < RARRAY
    LEN(list); ++i) {
    thread = RARRAYPTR(list)[i];
    rb
    threadwakeupalive(thread);
    }
    RBGCGUARD(list);
    }

    I prefer:

    static void
    wakeupallthreads(VALUE list0)
    {
    VALUE thread, list = rbarysubseq(list0, 0, LONG_MAX);
    long i;

    rbaryclear(list0);
    for (i = 0; i < RARRAYLEN(list); ++i) {
    thread = RARRAY
    PTR(list)[i];
    rbthreadwakeupalive(thread);
    }
    RB
    GC_GUARD(list);
    }

    But I prefer more:

    static void
    wakeupallthreads(VALUE list)
    {
    VALUE thread;
    long i;

    for (i = 0; i < RARRAYLEN(list); ++i) {
    thread = RARRAY
    PTR(list)[i];
    rbthreadwakeupalive(thread);
    }
    rb
    ary_clear(list);
    }

    Any reason to dup array before iteration?

  • TDATA -> TSTRUCT or T_OBJECT

    In this case, you don't need to use TDATA. You can only use TSTRUCT
    or T_OBJECT (with hidden attr). Maybe it will be simple.

    Thanks,
    Koichi


    Feature #3620: Add Queue, SIzedQueue and ConditionVariable implementations in C in addition to ruby ones
    https://bugs.ruby-lang.org/issues/3620#change-27471

    // SASADA Koichi at atdot dot net

#11 Updated by Koichi Sasada over 1 year ago

ping.

#12 Updated by Koichi Sasada over 1 year ago

  • Target version set to next minor

ping.
I think it can be introduce into Ruby 2.0 if there is a nice implementation.

#13 Updated by Masaki Matsushita 11 months ago

I fixed some bugs:
(1) blocking forever bug pointed out by ko1 in ruby-core:45950 SEGV in do_sleep()
(3) SizedQueue's bug which is similar to (1)

Now, the C implementation passes test-all.
Following diff is from finalqueuewithout_mutex.diff to my patch.

diff --git a/ext/thread/thread.c b/ext/thread/thread.c
index 9aff5e5..33365c1 100644
--- a/ext/thread/thread.c
+++ b/ext/thread/thread.c
@@ -150,7 +150,7 @@ static VALUE
dosleep(VALUE args)
{
struct sleep
call p = (struct sleepcall *)args;
- return rb
funcall(p->argv[0], rbintern("sleep"), p->argc-1, p->argv+1);
+ return rb
funcall(p->argv[0], rb_intern("sleep"), p->argc-1, p->argv[1]); /
(2) */
}

static VALUE
@@ -339,15 +339,31 @@ rbqueuepush(VALUE self, VALUE obj)
return self;
}

+struct waitingdelete {
+ VALUE waiting;
+ VALUE th;
+};
+
+static VALUE
+queue
deletefromwaiting(struct waitingdelete *p)
+{
+ rb
arydelete(p->waiting, p->th);
+ return Qnil;
+}
+
static VALUE
queue
dopop(Queue *queue, VALUE shouldblock)
{
- while (!RARRAYLEN(queue->que)) {
+ struct waiting
delete args;
+
+ while (RARRAYLEN(queue->que) == 0) {
if (!(int)should
block) {
rbraise(rbeThreadError, "queue empty");
}
- rbarypush(queue->waiting, rbthreadcurrent());
- rbthreadsleepforever();
+ args.waiting = queue->waiting;
+ args.th = rb
threadcurrent();
+ rb
arypush(args.waiting, args.th);
+ rb
ensure((VALUE ()())rbthreadsleepforever, (VALUE)0, queuedeletefromwaiting, (VALUE)&args); / (1) */
}

 return rb_ary_shift(queue->que);

@@ -590,9 +606,14 @@ rbszqueuemaxset(VALUE self, VALUE vmax)
static VALUE
szqueue
dopush(SizedQueue *szqueue, VALUE obj)
{
+ struct waiting
delete args;
+ VALUE thread;
+
while (queuelength(&szqueue->queue) >= szqueue->max) {
- rbarypush(szqueue->queuewait, rbthreadcurrent());
- rb
threadsleepforever();
+ args.waiting = szqueue->queuewait;
+ args.th = rb
threadcurrent();
+ rb
arypush(args.waiting, args.th);
+ rb
ensure((VALUE ()())rbthreadsleepforever, (VALUE)0, queuedeletefromwaiting, (VALUE)&args); / (3) */
}
return queuedopush(&szqueue->queue_, obj);
}

#14 Updated by Masaki Matsushita 11 months ago

Here is benchmark results.
I used the code same as .
( https://raw.github.com/panaggio/rubysoc-2010/master/benchmarks/queue.rb )

trunk (r40799):
user system total real
Q#push 3.290000 0.000000 3.290000 ( 3.292121)
Q#pop 1.900000 0.000000 1.900000 ( 1.898899)
Q#empty? 0.120000 0.000000 0.120000 ( 0.122000)
Q#clear 1.700000 0.010000 1.710000 ( 1.706695)
Q#size 0.130000 0.000000 0.130000 ( 0.128248)
Q#num_waiting 0.120000 0.000000 0.120000 ( 0.117081)

proposed:
user system total real
T#push 0.130000 0.000000 0.130000 ( 0.138489)
T#pop 0.120000 0.000000 0.120000 ( 0.119966)
T#empty? 0.100000 0.000000 0.100000 ( 0.104266)
T#clear 0.130000 0.000000 0.130000 ( 0.135279)
T#size 0.110000 0.000000 0.110000 ( 0.104839)
T#num_waiting 0.100000 0.000000 0.100000 ( 0.097017)

#15 Updated by Nobuyoshi Nakada 11 months ago

=begin
You can use (({rbfuncall2()})) instead here.
- return rb
funcall(p->argv[0], rbintern("sleep"), p->argc-1, p->argv+1);
+ return rb
funcall(p->argv[0], rb_intern("sleep"), p->argc-1, p->argv[1]); /* (2) */
=end

#16 Updated by Masaki Matsushita 11 months ago

2013/5/18 nobu (Nobuyoshi Nakada) nobu@ruby-lang.org

You can use rb_funcall2() instead here.

I made a patch which use rb_funcall2().

diff from patch.diff to patch2.diff:

diff --git a/ext/thread/thread.c b/ext/thread/thread.c
index 33365c1..f2d17e9 100644
--- a/ext/thread/thread.c
+++ b/ext/thread/thread.c
@@ -142,15 +142,17 @@ rbcondvarinitialize(VALUE self)
}

struct sleep_call {
- int argc;
- VALUE *argv;
+ VALUE mutex;
+ VALUE timeout;
};

+static ID idsleep;
+
static VALUE
do
sleep(VALUE args)
{
struct sleepcall *p = (struct sleepcall *)args;
- return rbfuncall(p->argv[0], rbintern("sleep"), p->argc-1, p->argv[1]);
+ return rbfuncall2(p->mutex, idsleep, 1, &p->timeout);
}

static VALUE
@@ -173,12 +175,16 @@ static VALUE
rbcondvarwait(int argc, VALUE *argv, VALUE self)
{
VALUE waiters = getcondvarptr(self)->waiters;
+ VALUE mutex, timeout;
struct sleep_call args;

  • args.argc = argc;
  • args.argv = argv;
  • rbscanargs(argc, argv, "11", &mutex, &timeout); +
  • args.mutex = mutex;
  • args.timeout = timeout; rbarypush(waiters, rbthreadcurrent()); rbensure(dosleep, (VALUE)&args, deletecurrentthread, waiters); + return self; }

@@ -607,7 +613,6 @@ static VALUE
szqueuedopush(SizedQueue *szqueue, VALUE obj)
{
struct waiting_delete args;
- VALUE thread;

 while (queue_length(&szqueue->queue_) >= szqueue->max) {
    args.waiting = szqueue->queue_wait;

@@ -699,6 +704,8 @@ Initthread(void)
VALUE rb
cQueue = DEFINECLASSUNDERTHREAD(Queue, rbcObject);
VALUE rbcSizedQueue = DEFINECLASSUNDERTHREAD(SizedQueue, rb_cQueue);

  • idsleep = rbintern("sleep"); + rbdefineallocfunc(rbcConditionVariable, condvaralloc); rbdefinemethod(rbcConditionVariable, "initialize", rbcondvarinitialize, 0); rbdefinemethod(rbcConditionVariable, "wait", rbcondvar_wait, -1);

#17 Updated by Koichi Sasada 8 months ago

I rewrite to use TSTRUCT instead of TDATA (attached).

Matsushita-san:
Could you check it?

Issue:
With ext/thraed/thread.c instead of lib/thread.rb, we can't install ruby itself.

(1) rbinstall.rb is invoked with miniruby
(2) rbinstall.rb requires rubygems.rb
(3) rubygems.rb requires lib/rubygems/coreext/kernelrequire.rb
(4) lib/rubygems/coreext/kernelrequire.rb requires lib/monitor.rb
(5) lib/monitor.rb requires thread.rb
Problems are
(a) miniruby can't require extension libraries
(b) rbinstall.rb is not invoked with -I option

Easy way to solve these problem is use rbinstall.rb by ./ruby instead of ./miniruby with -I option.

Another solution is to embed thread.rb features (CV and Queue) as embeded classes.

#18 Updated by Eric Wong 8 months ago

"ko1 (Koichi Sasada)" redmine@ruby-lang.org wrote:

Another solution is to embed thread.rb features (CV and Queue) as
embeded classes.

I prefer this with no extra .so. Too many .so files hurts load time.
Just leave an empty thread.rb for compatibility.

#19 Updated by Koichi Sasada 8 months ago

(2013/08/30 12:14), SASADA Koichi wrote:

A patch for this approach:
http://www.atdot.net/sp/raw/m5xbsm

Note that last patch does not care about deadlock detection.

--
// SASADA Koichi at atdot dot net

#20 Updated by Nobuyoshi Nakada 8 months ago

(13/08/29 18:54), ko1 (Koichi Sasada) wrote:

Problems are
(a) miniruby can't require extension libraries
(b) rbinstall.rb is not invoked with -I option

Easy way to solve these problem is use rbinstall.rb by ./ruby instead of ./miniruby with -I option.

Use $(RUNRUBY) in $(INSTRUBY) instead of $(MINIRUBY).

#21 Updated by Nobuyoshi Nakada 8 months ago

(13/08/30 15:26), Nobuyoshi Nakada wrote:

(13/08/29 18:54), ko1 (Koichi Sasada) wrote:

Problems are
(a) miniruby can't require extension libraries
(b) rbinstall.rb is not invoked with -I option

Easy way to solve these problem is use rbinstall.rb by ./ruby instead of ./miniruby with -I option.

Use $(RUNRUBY) in $(INSTRUBY) instead of $(MINIRUBY).

Since tool/rbinstall.rb replaces $:, it doesn't work as-is.

diff --git a/common.mk b/common.mk
index f295fb5..64ff5cc 100644
--- a/common.mk
+++ b/common.mk
@@ -123,7 +123,7 @@ SCRIPTARGS = --dest-dir="$(DESTDIR)" \
--make-flags="$(MAKEFLAGS)"
EXTMK
ARGS = $(SCRIPTARGS) --extension $(EXTS) --extstatic $(EXTSTATIC) \
--make-flags="V=$(V) MINIRUBY='$(MINIRUBY)'" --
-INSTRUBY = $(SUDO) $(MINIRUBY) $(srcdir)/tool/rbinstall.rb
+INSTRUBY = $(SUDO) $(RUNRUBY) -r./$(arch)-fake $(srcdir)/tool/rbinstall.rb
INSTRUBY
ARGS = $(SCRIPTARGS) \
--data-mode=$(INSTALL
DATAMODE) \
--prog-mode=$(INSTALL
PROG_MODE) \
@@ -449,7 +449,7 @@ post-no-install-doc::

CLEARINSTALLEDLIST = clear-installed-list

-install-prereq: $(CLEARINSTALLEDLIST) PHONY
+install-prereq: $(CLEARINSTALLEDLIST) yes-fake PHONY

clear-installed-list: PHONY
@> $(INSTALLED_LIST) set MAKE="$(MAKE)"

--
Nobu Nakada

#22 Updated by Masaki Matsushita 7 months ago

Sorry for my late response.
I fixed a bug in ext/thread.c ().
It was not compatible with Objects extended by Mutexm and test/testmutexm.rb failed.
Moreover, I use rb
threadsleepdeadly() to make it get along with deadlock detection.

Attached patch includes:
* updated ext/thread.c
* added test from to test/thread/test_queue.rb
* common.mk rewrited by Nakada-san

#23 Updated by Koichi Sasada 7 months ago

(2013/09/05 20:52), Glass_saga (Masaki Matsushita) wrote:

I fixed a bug in ext/thread.c ().
It was not compatible with Objects extended by Mutexm and test/testmutexm.rb failed.
Moreover, I use rb
threadsleepdeadly() to make it get along with deadlock detection.

Attached patch includes:
* updated ext/thread.c
* added test from to test/thread/test_queue.rb
* common.mk rewrited by Nakada-san

Go ahead.

I'll make another ticket to embed Queue in srcdir/thread.c.

--
// SASADA Koichi at atdot dot net

#24 Updated by Anonymous 7 months ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r42862.
Ricardo, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • common.mk: use RUNRUBY instead of MINIRUBY because MINIRUBY can't
    require extension libraries. The patch is from nobu
    (Nobuyoshi Nakada).

  • ext/thread/extconf.rb: for build ext/thread/thread.c.

  • include/ruby/intern.h: ditto.

  • thread.c: ditto.

  • lib/thread.rb: removed and replaced by ext/thread/thread.c.

  • ext/thread/thread.c: Queue, SizedQueue and ConditionVariable
    implementations in C. This patch is based on patches from panaggio
    (Ricardo Panaggio) and funny_falcon (Yura Sokolov) and ko1
    (Koichi Sasada). [Feature #3620]

  • test/thread/testqueue.rb (testqueuethreadraise): add a test for
    ensuring that killed thread should be removed from waiting threads.
    It is based on a code by ko1 (Koichi Sasada).

#25 Updated by Motohiro KOSAKI 7 months ago

(9/5/13 8:55 AM), SASADA Koichi wrote:

(2013/09/05 20:52), Glass_saga (Masaki Matsushita) wrote:

I fixed a bug in ext/thread.c ().
It was not compatible with Objects extended by Mutexm and test/testmutexm.rb failed.
Moreover, I use rb
threadsleepdeadly() to make it get along with deadlock detection.

Attached patch includes:
* updated ext/thread.c
* added test from to test/thread/test_queue.rb
* common.mk rewrited by Nakada-san

Go ahead.

I'll make another ticket to embed Queue in srcdir/thread.c.

We already have.

http://bugs.ruby-lang.org/issues/7923

#26 Updated by Koichi Sasada 7 months ago

(2013/09/13 3:03), KOSAKI Motohiro wrote:

We already have.

http://bugs.ruby-lang.org/issues/7923

I feel "making trap safe Queue (Feature #7923)" and embedding Queue
class is not same.

--
// SASADA Koichi at atdot dot net

Also available in: Atom PDF