Project

General

Profile

Actions

Feature #9121

closed

[PATCH] Remove rbtree implementation of SortedSet due to performance regression

Added by xshay (Xavier Shay) over 10 years ago. Updated over 2 years ago.

Status:
Closed
Target version:
-
[ruby-core:58396]

Description

rbtree is slower than the pure ruby version.

I have provided benchmarks and a patch here:
https://github.com/ruby/ruby/pull/451

ruby sorted_set_benchmark.rb
using rbtree
user system total real
#add 0.010000 0.000000 0.010000 ( 0.016446)
#delete 0.020000 0.000000 0.020000 ( 0.013248)
#include? 1000 items 0.010000 0.000000 0.010000 ( 0.011822)
#include? 2000 items 0.020000 0.000000 0.020000 ( 0.012572)
#include? 3000 items 0.020000 0.000000 0.020000 ( 0.013610)
#include? 4000 items 0.020000 0.000000 0.020000 ( 0.014295)
#include? 5000 items 0.010000 0.000000 0.010000 ( 0.018024)
#to_a 1000 items 0.580000 0.020000 0.600000 ( 0.616104)
#to_a 2000 items 1.170000 0.040000 1.210000 ( 1.213406)
#to_a 3000 items 1.730000 0.030000 1.760000 ( 1.773069)
#to_a 4000 items 2.370000 0.040000 2.410000 ( 2.420450)
#to_a 5000 items 2.920000 0.050000 2.970000 ( 2.975497)
ruby sorted_set_benchmark.rb
NOT using rbtree
user system total real
#add 0.010000 0.000000 0.010000 ( 0.007889)
#delete 0.010000 0.000000 0.010000 ( 0.004631)
#include? 1000 items 0.000000 0.000000 0.000000 ( 0.005060)
#include? 2000 items 0.010000 0.000000 0.010000 ( 0.005950)
#include? 3000 items 0.010000 0.000000 0.010000 ( 0.005814)
#include? 4000 items 0.010000 0.000000 0.010000 ( 0.005993)
#include? 5000 items 0.010000 0.000000 0.010000 ( 0.006923)
#to_a 1000 items 0.000000 0.000000 0.000000 ( 0.001863)
#to_a 2000 items 0.000000 0.000000 0.000000 ( 0.002145)
#to_a 3000 items 0.000000 0.000000 0.000000 ( 0.002129)
#to_a 4000 items 0.000000 0.000000 0.000000 ( 0.002265)
#to_a 5000 items 0.000000 0.000000 0.000000 ( 0.002428)


Related issues 1 (0 open1 closed)

Related to Ruby master - Feature #2348: RBTree Should be Added to the Standard LibraryRejectedmatz (Yukihiro Matsumoto)Actions

Updated by zzak (zzak _) over 10 years ago

  • Category set to lib
  • Status changed from Open to Assigned
  • Assignee set to knu (Akinori MUSHA)

Updated by vjoel (Joel VanderWerf) over 10 years ago

As noted at https://github.com/ruby/ruby/pull/451#issuecomment-28741490:

These benchmarks miss the point of using rbtree, which is to pay a small insertion cost to keep the structure sorted so that ordered lookups are fast. If you only need to perform lookups after the structure is built, then rbtree is a waste of time.

See https://gist.github.com/vjoel/7535917.

Also, this part of the benchmark is particularly misleading:

b.report("#to_a #{n} items") {
10000.times { s.to_a }
}

because the non-rbtree implementation caches the #to_a output and will never drop that cache inside of the above loop.

Updated by knu (Akinori MUSHA) over 10 years ago

Thanks for your input, guys.

I think I'll drop the optional rbtree version of SortedSet for now, since rbtree is seemingly broken for the latest version of ruby.

Updated by mame (Yusuke Endoh) over 10 years ago

knu (Akinori MUSHA) wrote:

rbtree is seemingly broken for the latest version of ruby.

What do you mean? What broke rbtree?
I'm afraid it is a more serious problem than this ticket itself.

--
Yusuke Endoh

Updated by zzak (zzak _) over 10 years ago

See #7698 and https://github.com/seki/Drip/issues/4

On Nov 23, 2013, at 6:09 PM, "mame (Yusuke Endoh)" wrote:

Issue #9121 has been updated by mame (Yusuke Endoh).

knu (Akinori MUSHA) wrote:

rbtree is seemingly broken for the latest version of ruby.

What do you mean? What broke rbtree?
I'm afraid it is a more serious problem than this ticket itself.

--
Yusuke Endoh

Bug #9121: [PATCH] Remove rbtree implementation of SortedSet due to performance regression
https://bugs.ruby-lang.org/issues/9121#change-43101

Author: xshay (Xavier Shay)
Status: Assigned
Priority: Normal
Assignee: knu (Akinori MUSHA)
Category: lib
Target version:
ruby -v: 2.0.0-p247
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN

rbtree is slower than the pure ruby version.

I have provided benchmarks and a patch here:
https://github.com/ruby/ruby/pull/451

ruby sorted_set_benchmark.rb
using rbtree
user system total real
#add 0.010000 0.000000 0.010000 ( 0.016446)
#delete 0.020000 0.000000 0.020000 ( 0.013248)
#include? 1000 items 0.010000 0.000000 0.010000 ( 0.011822)
#include? 2000 items 0.020000 0.000000 0.020000 ( 0.012572)
#include? 3000 items 0.020000 0.000000 0.020000 ( 0.013610)
#include? 4000 items 0.020000 0.000000 0.020000 ( 0.014295)
#include? 5000 items 0.010000 0.000000 0.010000 ( 0.018024)
#to_a 1000 items 0.580000 0.020000 0.600000 ( 0.616104)
#to_a 2000 items 1.170000 0.040000 1.210000 ( 1.213406)
#to_a 3000 items 1.730000 0.030000 1.760000 ( 1.773069)
#to_a 4000 items 2.370000 0.040000 2.410000 ( 2.420450)
#to_a 5000 items 2.920000 0.050000 2.970000 ( 2.975497)
ruby sorted_set_benchmark.rb
NOT using rbtree
user system total real
#add 0.010000 0.000000 0.010000 ( 0.007889)
#delete 0.010000 0.000000 0.010000 ( 0.004631)
#include? 1000 items 0.000000 0.000000 0.000000 ( 0.005060)
#include? 2000 items 0.010000 0.000000 0.010000 ( 0.005950)
#include? 3000 items 0.010000 0.000000 0.010000 ( 0.005814)
#include? 4000 items 0.010000 0.000000 0.010000 ( 0.005993)
#include? 5000 items 0.010000 0.000000 0.010000 ( 0.006923)
#to_a 1000 items 0.000000 0.000000 0.000000 ( 0.001863)
#to_a 2000 items 0.000000 0.000000 0.000000 ( 0.002145)
#to_a 3000 items 0.000000 0.000000 0.000000 ( 0.002129)
#to_a 4000 items 0.000000 0.000000 0.000000 ( 0.002265)
#to_a 5000 items 0.000000 0.000000 0.000000 ( 0.002428)

--
http://bugs.ruby-lang.org/

Updated by knu (Akinori MUSHA) over 10 years ago

mame (Yusuke Endoh) wrote:

knu (Akinori MUSHA) wrote:

rbtree is seemingly broken for the latest version of ruby.

What do you mean? What broke rbtree?

Try it yourself and you'll see. It relies on the internal data structure of RHash at some point of Ruby that lasted until 1.9.3.

I'm afraid it is a more serious problem than this ticket itself.

It is obviously a third-party issue and the point of this issue is that we should not depend on it any more, so I'll go ahead anyway.

Updated by knu (Akinori MUSHA) over 10 years ago

I wrote:

... and the point of this issue is that we should not depend on it any more, so I'll go ahead anyway.

Oops, I mistook this issue for something else, I just thought that's another reason to ditch the rbtree dependency albeit optional.

Updated by mame (Yusuke Endoh) over 10 years ago

zzak (Zachary Scott) wrote:

See #7698 and https://github.com/seki/Drip/issues/4

Thank you, but it works now for me.

$ ruby -v
ruby 2.0.0p353 (2013-11-22 revision 43784) [x86_64-linux]
$ gem -v
2.1.11
$ gem install rbtree
Building native extensions. This could take a while...
Successfully installed rbtree-0.4.1
Parsing documentation for rbtree-0.4.1
Done installing documentation for rbtree after 0 seconds
1 gem installed

knu (Akinori MUSHA) wrote:

What do you mean? What broke rbtree?

Try it yourself and you'll see. It relies on the internal data structure of RHash at some point of Ruby that lasted until 1.9.3.

I'm using rbtree on 2.0.0 everyday, but I have never encountered any problem.
Where was the issue discussed? Could you show me a pointer?

I'm afraid it is a more serious problem than this ticket itself.

It is obviously a third-party issue and the point of this issue is that we should not depend on it any more, so I'll go ahead anyway.

In principle, I agree that it is not cool for a standard library to depend
a third-party library. But in fact, (a part of) set.rb has depended on rbtree.
As Joel said, some operations will become very slow if it is removed.
Isn't it a compatibility problem?

--
Yusuke Endoh

Updated by knu (Akinori MUSHA) over 10 years ago

  • Target version set to 3.0

Maybe. And I noticed the second preview (not RC though) of 2.1.0 was out, so I'll postpone any change to SortedSet to the next major.

Updated by tmm1 (Aman Karmani) over 10 years ago

The following patch fixes rbtree on ruby 2.1: https://gist.github.com/tmm1/7609371

I emailed it to 8 days ago, but no response.

Updated by zzak (zzak _) over 10 years ago

Thank you tmm1!

There is also rbtree2 (https://github.com/kitak/rbtree2), that is also seemingly unmaintained, but perhaps they will be more willing to do a release. Ofcourse, it would be best if we could get OZAWA-san to respond to the patch and release.

Updated by knu (Akinori MUSHA) about 10 years ago

On second thought, if a working version rbtree is available out there (at least for CRuby and hopefully for JRuby) then I would like SortedSet to be moved from the standard library to a third-party gem.

The non-rbtree implementation of SortedSet should be seen as an ad hoc substitute, not a professional, production quality implementation to provide the performance characteristics that would be naturally expected from the name.

Updated by zzak (zzak _) about 10 years ago

I just wanted to remind everyone of the discussion that took place regarding adding RBTree to stdlib in [Feature #2348]

Updated by hsbt (Hiroshi SHIBATA) over 9 years ago

  • Related to Feature #2348: RBTree Should be Added to the Standard Library added
Actions #15

Updated by naruse (Yui NARUSE) over 3 years ago

  • Tracker changed from Bug to Feature
  • Target version deleted (3.0)
  • ruby -v deleted (2.0.0-p247)
  • Backport deleted (1.9.3: UNKNOWN, 2.0.0: UNKNOWN)

Updated by jeremyevans0 (Jeremy Evans) over 2 years ago

  • Status changed from Assigned to Closed
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0