Project

General

Profile

Actions

Feature #21821

open

Add Stack and SizedStack classes

Feature #21821: Add Stack and SizedStack classes

Added by byroot (Jean Boussier) about 2 months ago. Updated about 1 month ago.

Status:
Open
Assignee:
-
Target version:
[ruby-core:<unknown>]

Description

Context

Queue and SizedQueue are very useful and performant constructs, however they only allow for FIFO queues.
Some use cases do call for LIFO queues AKA stacks.

For instance, in the case of connection pool, it's often preferable to use a stack.

Feature

I'd like to introduce Stack and SizedStack classes, to mirror Queue and SizedQueue.
They'd have exactly the same method and behavior at the exception of dequeuing order.

Thread namespace?

Currently Queue and SizedQueue are technically defined under Thread and aliased in Object.
I wonder if that's something we should do for Stack too, or just some historical thing we shouldn't perpetuate.


Related issues 1 (0 open1 closed)

Related to Ruby - Feature #21721: Allow `Queue` and `SizedQueue` to be used as LIFO queuesRejectedActions

Updated by byroot (Jean Boussier) about 2 months ago Actions #1

  • Related to Feature #21721: Allow `Queue` and `SizedQueue` to be used as LIFO queues added

Updated by nobu (Nobuyoshi Nakada) about 2 months ago Actions #2

byroot (Jean Boussier) wrote:

Currently Queue and SizedQueue are technically defined under Thread and aliased in Object.
I wonder if that's something we should do for Stack too, or just some historical thing we shouldn't perpetuate.

I think it will depend on what you achieve with Stack.
If you want to use it for inter thread communication, Thread namespace should be suitable.
Otherwise, if you want just simple push/pop operations, why isn't Array enough?

Updated by byroot (Jean Boussier) about 2 months ago Actions #3

I think it will depend on what you achieve with Stack.

Yes, the point is to use it in cases where synchronization is needed, hence Array is not suitable. Typically connection pools: https://github.com/mperham/connection_pool/blob/8ba2715edca69d6162fd4f798997b74dd64e9529/lib/connection_pool/timed_stack.rb

Updated by ko1 (Koichi Sasada) about 1 month ago Actions #4 [ruby-core:124506]

I heard that C++ has deque: Double-Ended Queue https://en.cppreference.com/w/cpp/container/deque.html

  • The sound of Deque is similar to dequeue or deq (operation names). so introducing Dequeue seems not good idea.
  • If we assume the implementation of Queue is based on Dequeu, adding pop_back() (LIFO operation) seems reasonalbe, and easy to introduce than introducing new classes, and good aligned to Ruby's large-class philosophy. I want to ask Matz again.

I'm afraid that expanding the similar idea like Thread::Heap, Thread::PriorityQueue and so on, atomic containers for each data structure.

Updated by Eregon (Benoit Daloze) about 1 month ago · Edited Actions #5 [ruby-core:124507]

ko1 (Koichi Sasada) wrote in #note-4:

  • If we assume the implementation of Queue is based on Dequeu, adding pop_back() (LIFO operation) seems reasonalbe, and easy to introduce than introducing new classes, and good aligned to Ruby's large-class philosophy. I want to ask Matz again.

This would make the implementation of Queue (based on Deque or not) more complicated and less efficient, as I said in the last paragraph of https://bugs.ruby-lang.org/issues/21721#note-2.

I think we should either:

I'm afraid that expanding the similar idea like Thread::Heap, Thread::PriorityQueue and so on, atomic containers for each data structure.

Yes, it starts to feel like Java. I think there are not enough use cases for concurrent stacks to deserve to be core.
It could be in concurrent-ruby perhaps, or maybe the 2 alternatives linked above are good enough.

Updated by Eregon (Benoit Daloze) about 1 month ago Actions #6 [ruby-core:124508]

From a quick look the closest thing to a concurrent Stack in Java is LinkedBlockingDeque.
That implementation uses a single lock and condition variables, so it's not particularly efficient and likely causes significant contention.
In comparison LinkedBlockingQueue uses two locks (basically one for push, one for pop) to avoid contention, unless the queue is completely empty and then both locks need to be taken.
That illustrates my point that having a Deque limits the implementation choices and optimizations significantly.

Updated by Eregon (Benoit Daloze) about 1 month ago Actions #7 [ruby-core:124509]

BTW #num_waiting is always tricky to implement so if we add new classes I would like to not have that, because it's always messy to implement and makes it e.g. impossible to reuse JDK classes for JRuby & TruffleRuby.

Updated by byroot (Jean Boussier) about 1 month ago Actions #8 [ruby-core:124511]

there are 2 reasonable alternatives

About the second one, it's clever but not sure if ideal because it's no longer "atomic" on MRI, so is subject to Thread#raise. Looking at the connection_pool gem, one thing that slows it down considerably is the numerous calls to Thread.handle_interrupt to avoid leaking connections when Timeout is used.

Updated by Eregon (Benoit Daloze) about 1 month ago · Edited Actions #9 [ruby-core:124512]

byroot (Jean Boussier) wrote in #note-8:

About the second one, it's clever but not sure if ideal because it's no longer "atomic" on MRI, so is subject to Thread#raise. Looking at the connection_pool gem, one thing that slows it down considerably is the numerous calls to Thread.handle_interrupt to avoid leaking connections when Timeout is used.

I see https://github.com/mperham/connection_pool/blob/f3645821d02fe8de5089f31b61600a1f6217afb1/lib/connection_pool.rb#L64-L73
There is lots of code in checkin/checkout anyway, those Thread.handle_interrupt can't be removed even if SizedStack exists.
Even if it's a single method call directly to a SizedStack method, that could still check interrupts, and it definitely would for pop, so you'd still need the Thread.handle_interrupt.
On that topic maybe we should have all ensure automatically be Thread.handle_interrupt(Object => :never) (https://bugs.ruby-lang.org/issues/17849#note-5), or a way to opt-in per ensure. Though it wouldn't cover checkout then, which I'm unsure if OK or not. Maybe a way to mark a method as uninterruptible then for efficiency (but obviously problematic if misused).

BTW it doesn't seem great to have checkout in Thread.handle_interrupt(Object => :never) because then that thread can't be interrupted, but checkout might take a while because it does a pop with a 5 seconds default timeout. So probably checkout should not be under handle_interrupt(Object => :never) in the first place. The tricky part is once the connection is returned by pop it needs to become uninterruptible, but there is no way to do that with the Thread.handle_interrupt API. So probably a smaller timeout is the best workaround here.

Also ConnectionPool::TimedStack has a bunch of code which doesn't seem to me like it could be replaced or implemented with a SizedStack (but I haven't tried).
It reminds me when I once tried to replace the queue in Puma's thread pool with a SizedQueue to remove the extra ConditionVariable & Mutex, but it's not possible, it does things that SizedQueue can't do.

IOW, I think we should have performance numbers here and implemented use cases that pass the relevant test suites, otherwise we might add core class(es) that actually can't be used for the intended use cases.

Updated by byroot (Jean Boussier) about 1 month ago Actions #10 [ruby-core:124513]

IOW, I think we should have performance numbers here and implemented use cases that pass the relevant test suites

Yes that's fair. I meant to re-implement the redis-client pool using Thread::Queue as a proof of concept, but didn't get around to it yet.

Updated by p8 (Petrik de Heus) about 1 month ago · Edited Actions #11 [ruby-core:124516]

Eregon (Benoit Daloze) wrote in #note-9:

It reminds me when I once tried to replace the queue in Puma's thread pool with a SizedQueue to remove the extra ConditionVariable & Mutex, but it's not possible, it does things that SizedQueue can't do.

The ConditionVariable in Puma 7's thread pool is causing issues in JRuby: https://github.com/jruby/jruby/issues/9140
So I thought about attempting to rewrite it without the ConditionVariable (but that is way out of my expertise and now I'm learning not possible with a SizedQueue).

Sequel's timed_queue avoids the ConditionVariable with big improvements for some, but I'm not sure that can be applied to Puma.

Updated by mame (Yusuke Endoh) about 1 month ago Actions #12 [ruby-core:124535]

Discussed at the dev meeting. No conclusion has been reached, but some opinions raised included:

  • Instead of introducing separate classes for each algorithm, how about introducing a generic container class under a more use-case-driven name like Thread::Pool? Thread::Pool.new(:fifo) or Thread::Pool.new(:lifo) could determine how to retrieve items, for example. (note: @matz (Yukihiro Matsumoto) didn't say his opinion about this, as I recall)
  • If we only need to add a method to retrieve items LIFO from Queue, returning to #21721, how about a method name that is clearly not orthodox, like Queue#unpush? @matz (Yukihiro Matsumoto) said he can compromise if a better name could be found.

Updated by Eregon (Benoit Daloze) about 1 month ago Actions #13 [ruby-core:124548]

p8 (Petrik de Heus) wrote in #note-11:

The ConditionVariable in Puma 7's thread pool is causing issues in JRuby: https://github.com/jruby/jruby/issues/9140

I recall it was also causing issues in TruffleRuby back then hence my short attempt to remove it.
But we fixed ConditionVariable since then and it seems to work fine now (though indeed there is the overhead of creating a InterruptedException, but the whole Java blocking/interruption is basically based on those so seems hard to fix and more of a JVM bug IMO).
ConditionVariable are a very useful concurrency primitive, so I don't think it's productive in general to try to avoid them, better to optimize them since they are used in plenty of places.

So I thought about attempting to rewrite it without the ConditionVariable (but that is way out of my expertise and now I'm learning not possible with a SizedQueue).

Not possible is probably too strong, what I meant to say is when I looked at it and tried it there are some methods I couldn't implement with a SizedQueue, so the API back then and I think the API now has methods which are not implementable with a SizedQueue.

Actions

Also available in: PDF Atom