Feature #9121
closed[PATCH] Remove rbtree implementation of SortedSet due to performance regression
Description
rbtree is slower than the pure ruby version.
I have provided benchmarks and a patch here:
https://github.com/ruby/ruby/pull/451
ruby sorted_set_benchmark.rb
using rbtree
user system total real
#add 0.010000 0.000000 0.010000 ( 0.016446)
#delete 0.020000 0.000000 0.020000 ( 0.013248)
#include? 1000 items 0.010000 0.000000 0.010000 ( 0.011822)
#include? 2000 items 0.020000 0.000000 0.020000 ( 0.012572)
#include? 3000 items 0.020000 0.000000 0.020000 ( 0.013610)
#include? 4000 items 0.020000 0.000000 0.020000 ( 0.014295)
#include? 5000 items 0.010000 0.000000 0.010000 ( 0.018024)
#to_a 1000 items 0.580000 0.020000 0.600000 ( 0.616104)
#to_a 2000 items 1.170000 0.040000 1.210000 ( 1.213406)
#to_a 3000 items 1.730000 0.030000 1.760000 ( 1.773069)
#to_a 4000 items 2.370000 0.040000 2.410000 ( 2.420450)
#to_a 5000 items 2.920000 0.050000 2.970000 ( 2.975497)
ruby sorted_set_benchmark.rb
NOT using rbtree
user system total real
#add 0.010000 0.000000 0.010000 ( 0.007889)
#delete 0.010000 0.000000 0.010000 ( 0.004631)
#include? 1000 items 0.000000 0.000000 0.000000 ( 0.005060)
#include? 2000 items 0.010000 0.000000 0.010000 ( 0.005950)
#include? 3000 items 0.010000 0.000000 0.010000 ( 0.005814)
#include? 4000 items 0.010000 0.000000 0.010000 ( 0.005993)
#include? 5000 items 0.010000 0.000000 0.010000 ( 0.006923)
#to_a 1000 items 0.000000 0.000000 0.000000 ( 0.001863)
#to_a 2000 items 0.000000 0.000000 0.000000 ( 0.002145)
#to_a 3000 items 0.000000 0.000000 0.000000 ( 0.002129)
#to_a 4000 items 0.000000 0.000000 0.000000 ( 0.002265)
#to_a 5000 items 0.000000 0.000000 0.000000 ( 0.002428)
Updated by zzak (zzak _) about 11 years ago
- Category set to lib
- Status changed from Open to Assigned
- Assignee set to knu (Akinori MUSHA)
Updated by vjoel (Joel VanderWerf) about 11 years ago
As noted at https://github.com/ruby/ruby/pull/451#issuecomment-28741490:
These benchmarks miss the point of using rbtree, which is to pay a small insertion cost to keep the structure sorted so that ordered lookups are fast. If you only need to perform lookups after the structure is built, then rbtree is a waste of time.
See https://gist.github.com/vjoel/7535917.
Also, this part of the benchmark is particularly misleading:
b.report("#to_a #{n} items") {
10000.times { s.to_a }
}
because the non-rbtree implementation caches the #to_a output and will never drop that cache inside of the above loop.
Updated by knu (Akinori MUSHA) about 11 years ago
Thanks for your input, guys.
I think I'll drop the optional rbtree version of SortedSet for now, since rbtree is seemingly broken for the latest version of ruby.
Updated by mame (Yusuke Endoh) about 11 years ago
knu (Akinori MUSHA) wrote:
rbtree is seemingly broken for the latest version of ruby.
What do you mean? What broke rbtree?
I'm afraid it is a more serious problem than this ticket itself.
--
Yusuke Endoh mame@tsg.ne.jp
Updated by zzak (zzak _) about 11 years ago
See #7698 and https://github.com/seki/Drip/issues/4
On Nov 23, 2013, at 6:09 PM, "mame (Yusuke Endoh)" mame@tsg.ne.jp wrote:
Issue #9121 has been updated by mame (Yusuke Endoh).
knu (Akinori MUSHA) wrote:
rbtree is seemingly broken for the latest version of ruby.
What do you mean? What broke rbtree?
I'm afraid it is a more serious problem than this ticket itself.--
Yusuke Endoh mame@tsg.ne.jpBug #9121: [PATCH] Remove rbtree implementation of SortedSet due to performance regression
https://bugs.ruby-lang.org/issues/9121#change-43101Author: xshay (Xavier Shay)
Status: Assigned
Priority: Normal
Assignee: knu (Akinori MUSHA)
Category: lib
Target version:
ruby -v: 2.0.0-p247
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWNrbtree is slower than the pure ruby version.
I have provided benchmarks and a patch here:
https://github.com/ruby/ruby/pull/451ruby sorted_set_benchmark.rb
using rbtree
user system total real
#add 0.010000 0.000000 0.010000 ( 0.016446)
#delete 0.020000 0.000000 0.020000 ( 0.013248)
#include? 1000 items 0.010000 0.000000 0.010000 ( 0.011822)
#include? 2000 items 0.020000 0.000000 0.020000 ( 0.012572)
#include? 3000 items 0.020000 0.000000 0.020000 ( 0.013610)
#include? 4000 items 0.020000 0.000000 0.020000 ( 0.014295)
#include? 5000 items 0.010000 0.000000 0.010000 ( 0.018024)
#to_a 1000 items 0.580000 0.020000 0.600000 ( 0.616104)
#to_a 2000 items 1.170000 0.040000 1.210000 ( 1.213406)
#to_a 3000 items 1.730000 0.030000 1.760000 ( 1.773069)
#to_a 4000 items 2.370000 0.040000 2.410000 ( 2.420450)
#to_a 5000 items 2.920000 0.050000 2.970000 ( 2.975497)
ruby sorted_set_benchmark.rb
NOT using rbtree
user system total real
#add 0.010000 0.000000 0.010000 ( 0.007889)
#delete 0.010000 0.000000 0.010000 ( 0.004631)
#include? 1000 items 0.000000 0.000000 0.000000 ( 0.005060)
#include? 2000 items 0.010000 0.000000 0.010000 ( 0.005950)
#include? 3000 items 0.010000 0.000000 0.010000 ( 0.005814)
#include? 4000 items 0.010000 0.000000 0.010000 ( 0.005993)
#include? 5000 items 0.010000 0.000000 0.010000 ( 0.006923)
#to_a 1000 items 0.000000 0.000000 0.000000 ( 0.001863)
#to_a 2000 items 0.000000 0.000000 0.000000 ( 0.002145)
#to_a 3000 items 0.000000 0.000000 0.000000 ( 0.002129)
#to_a 4000 items 0.000000 0.000000 0.000000 ( 0.002265)
#to_a 5000 items 0.000000 0.000000 0.000000 ( 0.002428)
Updated by knu (Akinori MUSHA) about 11 years ago
mame (Yusuke Endoh) wrote:
knu (Akinori MUSHA) wrote:
rbtree is seemingly broken for the latest version of ruby.
What do you mean? What broke rbtree?
Try it yourself and you'll see. It relies on the internal data structure of RHash at some point of Ruby that lasted until 1.9.3.
I'm afraid it is a more serious problem than this ticket itself.
It is obviously a third-party issue and the point of this issue is that we should not depend on it any more, so I'll go ahead anyway.
Updated by knu (Akinori MUSHA) about 11 years ago
I wrote:
... and the point of this issue is that we should not depend on it any more, so I'll go ahead anyway.
Oops, I mistook this issue for something else, I just thought that's another reason to ditch the rbtree dependency albeit optional.
Updated by mame (Yusuke Endoh) about 11 years ago
zzak (Zachary Scott) wrote:
Thank you, but it works now for me.
$ ruby -v
ruby 2.0.0p353 (2013-11-22 revision 43784) [x86_64-linux]
$ gem -v
2.1.11
$ gem install rbtree
Building native extensions. This could take a while...
Successfully installed rbtree-0.4.1
Parsing documentation for rbtree-0.4.1
Done installing documentation for rbtree after 0 seconds
1 gem installed
knu (Akinori MUSHA) wrote:
What do you mean? What broke rbtree?
Try it yourself and you'll see. It relies on the internal data structure of RHash at some point of Ruby that lasted until 1.9.3.
I'm using rbtree on 2.0.0 everyday, but I have never encountered any problem.
Where was the issue discussed? Could you show me a pointer?
I'm afraid it is a more serious problem than this ticket itself.
It is obviously a third-party issue and the point of this issue is that we should not depend on it any more, so I'll go ahead anyway.
In principle, I agree that it is not cool for a standard library to depend
a third-party library. But in fact, (a part of) set.rb has depended on rbtree.
As Joel said, some operations will become very slow if it is removed.
Isn't it a compatibility problem?
--
Yusuke Endoh mame@tsg.ne.jp
Updated by knu (Akinori MUSHA) about 11 years ago
- Target version set to 3.0
Maybe. And I noticed the second preview (not RC though) of 2.1.0 was out, so I'll postpone any change to SortedSet to the next major.
Updated by tmm1 (Aman Karmani) about 11 years ago
The following patch fixes rbtree on ruby 2.1: https://gist.github.com/tmm1/7609371
I emailed it to burningdowntheopera@yahoo.co.jp 8 days ago, but no response.
Updated by zzak (zzak _) about 11 years ago
Thank you tmm1!
There is also rbtree2 (https://github.com/kitak/rbtree2), that is also seemingly unmaintained, but perhaps they will be more willing to do a release. Ofcourse, it would be best if we could get OZAWA-san to respond to the patch and release.
Updated by knu (Akinori MUSHA) almost 11 years ago
On second thought, if a working version rbtree is available out there (at least for CRuby and hopefully for JRuby) then I would like SortedSet to be moved from the standard library to a third-party gem.
The non-rbtree implementation of SortedSet should be seen as an ad hoc substitute, not a professional, production quality implementation to provide the performance characteristics that would be naturally expected from the name.
Updated by zzak (zzak _) almost 11 years ago
I just wanted to remind everyone of the discussion that took place regarding adding RBTree to stdlib in [Feature #2348]
Updated by hsbt (Hiroshi SHIBATA) over 10 years ago
- Related to Feature #2348: RBTree Should be Added to the Standard Library added
Updated by naruse (Yui NARUSE) about 4 years ago
- Tracker changed from Bug to Feature
- Target version deleted (
3.0) - ruby -v deleted (
2.0.0-p247) - Backport deleted (
1.9.3: UNKNOWN, 2.0.0: UNKNOWN)
Updated by jeremyevans0 (Jeremy Evans) about 3 years ago
- Status changed from Assigned to Closed
SortedSet was removed in a3db08d7b6ff119223f77e3df00b4f6deac971e2