Project

General

Profile

Actions

Feature #10714

closed

Array#reject! nonlinear performance problem

Added by akr (Akira Tanaka) about 10 years ago. Updated about 10 years ago.

Status:
Closed
Target version:
-
[ruby-core:67418]

Description

I found Array#reject! is too slow.

I measured it and it seems the performance is nonlinear.

% ./ruby -v -e '
20.times {|i|
  a = [nil]*i*10000;
  t1 = Time.now
  a.reject! { true }
  t2 = Time.now
  t = t2 - t1
  p ["*" * (t * 20).to_i , t]
}
'
ruby 2.3.0dev (2015-01-08 trunk 49175) [x86_64-linux]
["", 3.683e-06]
["", 0.019059723]
["*", 0.052964771]
["**", 0.1177318]
["****", 0.208824818]
["******", 0.334757354]
["*********", 0.482717139]
["*************", 0.669606441]
["*****************", 0.866588588]
["**********************", 1.116195389]
["***************************", 1.392828177]
["**********************************", 1.701906753]
["****************************************", 2.013290644]
["************************************************", 2.415258165]
["*******************************************************", 2.783918449]
["*****************************************************************", 3.27417584]
["**************************************************************************", 3.724958298]
["**************************************************************************************", 4.307263787]
["**************************************************************************************************", 4.922179118]
["************************************************************************************************************", 5.403641168]

Ruby 2.2, 2.1, 2.0, 1.9.3 also have the problem but Ruby 1.9.2 works well.

% ruby-1.9.2-p330 -v -e ' 
20.times {|i|
  a = [nil]*i*10000;
  t1 = Time.now
  a.reject! { true }
  t2 = Time.now
  t = t2 - t1
  p ["*" * (t * 20).to_i , t]
}
'
ruby 1.9.2p330 (2014-08-07 revision 47094) [x86_64-linux]
["", 2.111e-06]
["", 0.000798623]
["", 0.001441408]
["", 0.00155386]
["", 0.001656242]
["", 0.002166389]
["", 0.002355492]
["", 0.002703977]
["", 0.003123692]
["", 0.00348722]
["", 0.003884792]
["", 0.004300034]
["", 0.004701378]
["", 0.006854893]
["", 0.005485207]
["", 0.005972309]
["", 0.006298597]
["", 0.006901775]
["", 0.007216343]
["", 0.007373332]

Related issues 3 (0 open3 closed)

Related to Ruby master - Bug #2545: Array#delete_if is borked if user calls 'break'Closedmatz (Yukihiro Matsumoto)01/02/2010Actions
Related to Ruby master - Bug #5752: Array#delete_if side effects due to changeset r32360Closed12/13/2011Actions
Related to Ruby master - Bug #10722: Array#keep_if is borked if user calls 'break'Closed01/09/2015Actions

Updated by wanabe (_ wanabe) about 10 years ago

According to git bisect, it caused by r32373 (related to [Bug #2545]) .

Updated by nobu (Nobuyoshi Nakada) about 10 years ago

  • Related to Bug #2545: Array#delete_if is borked if user calls 'break' added

Updated by nobu (Nobuyoshi Nakada) about 10 years ago

  • Status changed from Open to Rejected

The target of Array#reject! is the receiver itself, so the modification can be observed from the given block, or the block can exit by break.
Therefore the compaction is necessary each times.

I can't think of the way to avoid it, unless we have a way to guarantee that an object is never referred from a block, directly or indirectly.

Updated by akr (Akira Tanaka) about 10 years ago

I don't think the modification must be observable from the given block.

[Bug #2545] doesn't discuss the observability in the block.
It discusses the receiver after "break".

If the modification is not required to be observable from the block,
it is possible to implement with linear performance.

% ruby -e '
class Array
  def my_reject!
    i = 0
    j = 0
    while i < self.length
      v = self[i]
      unless yield v
        self[j] = v
        j += 1
      end
      i += 1
    end
    i != j ? self : nil
  ensure
    if i != j
      self[j, i-j] = []
    end
  end
end
a = [1,2,3]; a.my_reject! {|v| v == 2 }; p a
a = [5,6,7,8,9,10]; a.my_reject! {|i| break i if i > 8; i < 7}; p a
'
[1, 3]
[7, 8, 9, 10]

So the problem is which is important between the linear performance and the observability.

I vote the linear performance because:

  • the nonlinear performance makes reject! useless for long array and
  • [Bug #2545] is not a issue for the observability.

Updated by nobu (Nobuyoshi Nakada) about 10 years ago

I know, and make test-all failed.

Updated by akr (Akira Tanaka) about 10 years ago

  • Related to Bug #5752: Array#delete_if side effects due to changeset r32360 added

Updated by akr (Akira Tanaka) about 10 years ago

Hm.
The failed test is introduced by you after [Bug #5752].

The bug report describes as "If this is indeed the intended behaviour, ...".
It means that the reporter is not sure that the current behavior is intended or not.

Why current behavior is desired over linear performance behavior?

Updated by nobu (Nobuyoshi Nakada) about 10 years ago

Ok, then try if something dies?

diff --git a/array.c b/array.c
index 0de7231..f2f7352 100644
--- a/array.c
+++ b/array.c
@@ -2824,6 +2824,48 @@ rb_ary_select(VALUE ary)
     return result;
 }
 
+struct select_bang_arg {
+    VALUE ary;
+    long len[2];
+};
+
+static VALUE
+select_bang_i(VALUE a)
+{
+    volatile struct select_bang_arg *arg = (void *)a;
+    VALUE ary = arg->ary;
+    long i1, i2;
+
+    for (i1 = i2 = 0; i1 < RARRAY_LEN(ary); arg->len[0] = ++i1) {
+	VALUE v = RARRAY_AREF(ary, i1);
+	if (!RTEST(rb_yield(v))) continue;
+	if (i1 != i2) {
+	    rb_ary_store(ary, i2, v);
+	}
+	arg->len[1] = ++i2;
+    }
+    return (i1 == i2) ? Qnil : ary;
+}
+
+static VALUE
+select_bang_ensure(VALUE a)
+{
+    volatile struct select_bang_arg *arg = (void *)a;
+    VALUE ary = arg->ary;
+    long len = RARRAY_LEN(ary);
+    long i1 = arg->len[0], i2 = arg->len[1];
+
+    if (i2 < i1) {
+	if (i1 < len) {
+	    RARRAY_PTR_USE(ary, ptr, {
+		    MEMMOVE(ptr + i2, ptr + i1, VALUE, len - i1);
+		});
+	}
+	ARY_SET_LEN(ary, len - i1 + i2);
+    }
+    return ary;
+}
+
 /*
  *  call-seq:
  *     ary.select!  {|item| block } -> ary or nil
@@ -2832,6 +2874,8 @@ rb_ary_select(VALUE ary)
  *  Invokes the given block passing in successive elements from +self+,
  *  deleting elements for which the block returns a +false+ value.
  *
+ *  The array may not be changed instantly every time the block is called.
+ *
  *  If changes were made, it will return +self+, otherwise it returns +nil+.
  *
  *  See also Array#keep_if
@@ -2843,22 +2887,14 @@ rb_ary_select(VALUE ary)
 static VALUE
 rb_ary_select_bang(VALUE ary)
 {
-    long i;
-    VALUE result = Qnil;
+    struct select_bang_arg args;
 
     RETURN_SIZED_ENUMERATOR(ary, 0, 0, ary_enum_length);
     rb_ary_modify(ary);
-    for (i = 0; i < RARRAY_LEN(ary); ) {
-	VALUE v = RARRAY_AREF(ary, i);
-	if (!RTEST(rb_yield(v))) {
-	    rb_ary_delete_at(ary, i);
-	    result = ary;
-	}
-	else {
-	    i++;
-	}
-    }
-    return result;
+
+    args.ary = ary;
+    args.len[0] = args.len[1] = 0;
+    return rb_ensure(select_bang_i, (VALUE)&args, select_bang_ensure, (VALUE)&args);
 }
 
 /*
@@ -3101,23 +3137,32 @@ ary_reject(VALUE orig, VALUE result)
 }
 
 static VALUE
-ary_reject_bang(VALUE ary)
+reject_bang_i(VALUE a)
 {
-    long i;
-    VALUE result = Qnil;
+    volatile struct select_bang_arg *arg = (void *)a;
+    VALUE ary = arg->ary;
+    long i1, i2;
 
-    rb_ary_modify_check(ary);
-    for (i = 0; i < RARRAY_LEN(ary); ) {
-	VALUE v = RARRAY_AREF(ary, i);
-	if (RTEST(rb_yield(v))) {
-	    rb_ary_delete_at(ary, i);
-	    result = ary;
-	}
-	else {
-	    i++;
+    for (i1 = i2 = 0; i1 < RARRAY_LEN(ary); arg->len[0] = ++i1) {
+	VALUE v = RARRAY_AREF(ary, i1);
+	if (RTEST(rb_yield(v))) continue;
+	if (i1 != i2) {
+	    rb_ary_store(ary, i2, v);
 	}
+	arg->len[1] = ++i2;
     }
-    return result;
+    return (i1 == i2) ? Qnil : ary;
+}
+
+static VALUE
+ary_reject_bang(VALUE ary)
+{
+    struct select_bang_arg args;
+
+    rb_ary_modify_check(ary);
+    args.ary = ary;
+    args.len[0] = args.len[1] = 0;
+    return rb_ensure(reject_bang_i, (VALUE)&args, select_bang_ensure, (VALUE)&args);
 }
 
 /*
@@ -3128,8 +3173,7 @@ ary_reject_bang(VALUE ary)
  *  Equivalent to Array#delete_if, deleting elements from +self+ for which the
  *  block evaluates to +true+, but returns +nil+ if no changes were made.
  *
- *  The array is changed instantly every time the block is called, not after
- *  the iteration is over.
+ *  The array may not be changed instantly every time the block is called.
  *
  *  See also Enumerable#reject and Array#delete_if.
  *

Updated by nobu (Nobuyoshi Nakada) about 10 years ago

Forgot a patch of the tests

diff --git a/test/ruby/test_array.rb b/test/ruby/test_array.rb
index 31f33dd..33fc5d6 100644
--- a/test/ruby/test_array.rb
+++ b/test/ruby/test_array.rb
@@ -661,7 +661,7 @@ class TestArray < Test::Unit::TestCase
 
     bug2545 = '[ruby-core:27366]'
     a = @cls[ 5, 6, 7, 8, 9, 10 ]
-    assert_equal(9, a.delete_if {|i| break i if i > 8; assert_equal(a[0], i) || true if i < 7})
+    assert_equal(9, a.delete_if {|i| break i if i > 8; i < 7})
     assert_equal(@cls[7, 8, 9, 10], a, bug2545)
   end
 
@@ -1160,7 +1160,7 @@ class TestArray < Test::Unit::TestCase
 
     bug2545 = '[ruby-core:27366]'
     a = @cls[ 5, 6, 7, 8, 9, 10 ]
-    assert_equal(9, a.reject! {|i| break i if i > 8; assert_equal(a[0], i) || true if i < 7})
+    assert_equal(9, a.reject! {|i| break i if i > 8; i < 7})
     assert_equal(@cls[7, 8, 9, 10], a, bug2545)
   end
 
Actions #10

Updated by akr (Akira Tanaka) about 10 years ago

Ok, then try if something dies?

I think we should try.

We can explain the reason of this change.

It may be a good idea to describe this issue in NEWS because a documented behavior changed, though.

Updated by akr (Akira Tanaka) about 10 years ago

  • Related to Bug #10722: Array#keep_if is borked if user calls 'break' added

Updated by duerst (Martin Dürst) about 10 years ago

Akira Tanaka wrote:

So the problem is which is important between the linear performance and the observability.

I vote the linear performance because:

  • the nonlinear performance makes reject! useless for long array and
  • [Bug #2545] is not a issue for the observability.

I fully agree. Nonlinear performance is a killer. Observability and ability to break are just nice-to-have. I think we should reopen the bug.

Updated by akr (Akira Tanaka) about 10 years ago

  • Tracker changed from Bug to Feature

Now, I think this issue is a feature instead of a bug because it changes a documented behavior.

Updated by zzak (zzak _) about 10 years ago

Using the following benchmark I compared nobu's patch vs. 2.2:

require 'derailed_benchmarks'
require 'derailed_benchmarks/tasks'

namespace :perf do
  desc "Array#select! and friends"
  task :array_select => [:setup] do
    Benchmark.ips do |x|
      [10, 100, 1000].map do |i|
        a = []
        i.times { |z| a = [nil]*z*10000 }

        x.report("Array#select! * #{i}") { a.select! { true } }
      end

      x.compare!
    end
  end
end

Here are the results:

2.2

Calculating -------------------------------------
  Array#select! * 10    24.000  i/100ms
 Array#select! * 100     2.000  i/100ms
Array#select! * 1000     1.000  i/100ms
-------------------------------------------------
  Array#select! * 10    243.090  (± 5.8%) i/s -      1.224k
 Array#select! * 100     22.046  (± 4.5%) i/s -    110.000
Array#select! * 1000      2.200  (± 0.0%) i/s -     11.000

Comparison:
  Array#select! * 10:      243.1 i/s
 Array#select! * 100:       22.0 i/s - 11.03x slower
Array#select! * 1000:        2.2 i/s - 110.48x slower

Patched

Calculating -------------------------------------
  Array#select! * 10    23.000  i/100ms
 Array#select! * 100     2.000  i/100ms
Array#select! * 1000     1.000  i/100ms
-------------------------------------------------
  Array#select! * 10    234.708  (± 6.0%) i/s -      1.173k
 Array#select! * 100     21.559  (± 4.6%) i/s -    108.000
Array#select! * 1000      2.129  (± 0.0%) i/s -     11.000

Comparison:
  Array#select! * 10:      234.7 i/s
 Array#select! * 100:       21.6 i/s - 10.89x slower
Array#select! * 1000:        2.1 i/s - 110.25x slower

Updated by funny_falcon (Yura Sokolov) about 10 years ago

Zachary, patch fixes reject! and you test select!, ie patch fixes case when most items were deleted, and you test case when no item is deleted.

Based on numbers you present, patched version is so close to unpatched so I could not tell real difference between.

Updated by zzak (zzak _) about 10 years ago

Ooops, there was a bug in my test, and after fixing it and updating the results above it seems that nobu's patch improves things slightly.

I'm going to post the results for both select! and reject! soon.

@Yura, you can see that nobu's patch also to #select! here: https://github.com/ruby/ruby/pull/811/files#diff-24411fb2a634ee46e29925b04767d15eR2851

Updated by zzak (zzak _) about 10 years ago

  • Status changed from Rejected to Open

Updated by zzak (zzak _) about 10 years ago

Here are the results for reject! using this benchmark:

require 'derailed_benchmarks'
require 'derailed_benchmarks/tasks'

namespace :perf do
  desc "Array#reject!"
  task :array_reject => [:setup] do
    Benchmark.ips do |x|
      20.times do |i|
        a = []
        i.times { |z| a = [nil]*z*10000 }

        x.report("Array#reject! * #{i}") { a.reject! { true } }
      end
    end
  end
end

2.2

   Array#reject! * 0      8.236M (± 6.4%) i/s -     40.928M
   Array#reject! * 1      8.255M (± 5.9%) i/s -     41.049M
   Array#reject! * 2      8.298M (± 5.5%) i/s -     41.297M
   Array#reject! * 3      8.291M (± 5.7%) i/s -     41.243M
   Array#reject! * 4      8.244M (± 7.0%) i/s -     40.901M
   Array#reject! * 5      8.148M (± 8.0%) i/s -     40.375M
   Array#reject! * 6      8.247M (± 6.6%) i/s -     40.939M
   Array#reject! * 7      8.309M (± 5.3%) i/s -     41.319M
   Array#reject! * 8      8.313M (± 5.1%) i/s -     41.369M
   Array#reject! * 9      8.332M (± 4.0%) i/s -     41.522M
  Array#reject! * 10      8.331M (± 4.0%) i/s -     41.491M
  Array#reject! * 11      8.281M (± 5.6%) i/s -     41.106M
  Array#reject! * 12      8.162M (± 8.2%) i/s -     39.586M
  Array#reject! * 13      1.006M (±17.3%) i/s -    743.452k
  Array#reject! * 14      1.007M (±17.0%) i/s -    743.165k
  Array#reject! * 15      1.007M (±17.0%) i/s -    746.140k
  Array#reject! * 16      1.006M (±17.2%) i/s -    747.039k
  Array#reject! * 17      1.007M (±17.2%) i/s -    752.532k
  Array#reject! * 18      1.006M (±17.2%) i/s -    741.489k
  Array#reject! * 19      1.006M (±17.3%) i/s -    745.300k

Patched

   Array#reject! * 0      7.285M (± 5.8%) i/s -     36.227M
   Array#reject! * 1      7.294M (± 6.5%) i/s -     36.274M
   Array#reject! * 2      7.378M (± 4.1%) i/s -     36.831M
   Array#reject! * 3      7.260M (± 7.8%) i/s -     35.974M
   Array#reject! * 4      7.385M (± 4.3%) i/s -     36.849M
   Array#reject! * 5      7.376M (± 5.5%) i/s -     36.735M
   Array#reject! * 6      7.145M (±10.2%) i/s -     35.136M
   Array#reject! * 7      6.974M (± 9.6%) i/s -     34.385M
   Array#reject! * 8      7.279M (± 5.7%) i/s -     36.259M
   Array#reject! * 9      7.333M (± 5.3%) i/s -     36.532M
  Array#reject! * 10      7.345M (± 5.5%) i/s -     36.554M
  Array#reject! * 11      7.328M (± 5.3%) i/s -     36.481M
  Array#reject! * 12      7.325M (± 3.2%) i/s -     36.579M
  Array#reject! * 13      7.319M (± 4.1%) i/s -     36.522M
  Array#reject! * 14      7.335M (± 5.5%) i/s -     36.527M
  Array#reject! * 15      7.338M (± 6.0%) i/s -     36.480M
  Array#reject! * 16      7.405M (± 5.2%) i/s -     36.901M
  Array#reject! * 17      7.151M (± 4.8%) i/s -     35.624M
  Array#reject! * 18      7.378M (± 5.7%) i/s -     36.714M
  Array#reject! * 19      7.414M (± 4.4%) i/s -     36.952M

Seems to have fixed the non-linear regression by this benchmark.

Updated by akr (Akira Tanaka) about 10 years ago

I think nobu has a right to accept this feature because the documented bahavior which will be changed was implemented and committed by nobu and others don't requested.

If nobu don't think so, we need approval of matz or naruse.

Updated by nobu (Nobuyoshi Nakada) about 10 years ago

I negate the part "others don't requested", ... somebody requested it, ... perhaps.

Updated by matz (Yukihiro Matsumoto) about 10 years ago

  • Assignee set to akr (Akira Tanaka)

I agree having linear performance is crucial, and the proposed new behavior is acceptable.
Go ahead, and make the change.

Matz.

Updated by akr (Akira Tanaka) about 10 years ago

Thank you, matz.

Please commit, nobu.

Updated by nobu (Nobuyoshi Nakada) about 10 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

Applied in changeset r49255.


array.c: linear performance

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0