Project

General

Profile

Actions

Feature #8820

closed

Speed up Array#index

Added by trans (Thomas Sawyer) over 10 years ago. Updated over 10 years ago.

Status:
Closed
Assignee:
-
Target version:
[ruby-core:56809]

Description

I did a quick comparison:

In Ruby

def main
  n = 10000000  # ten million
  a = randPerm(100)

  t0 = Time.now

  n.times do |i|
    a.index(i)
  end

  puts "%.5f" % [Time.now - t0]
end

def randPerm(n)
  (0...n).sort_by{rand}
end

main()

In Go

package main

import (
  "fmt"
  "time"
  "math/rand"
)

func main() {
  n := 10000000  // ten million
  a := rand.Perm(100)

  t0 := time.Now()

  for i := 0; i < n; i++ {
    index(a, i)
  }

  fmt.Printf("%.5f\n", time.Now().Sub(t0).Seconds())
}

func index(slice []int, value int) int {
  for i, v := range slice {
    if (v == value) {
      return i
    }
  }
  return -1
}

The result

Ruby: 71.08961 secs
Go: 2.61975 secs

That's pretty huge difference (and worse I was told my Go index function was "crazily inefficient" too, though personally I don't see how it can be any better). So, I thought I'd mention it. Maybe it would be possible to speed up.

Updated by normalperson (Eric Wong) over 10 years ago

"trans (Thomas Sawyer)" wrote:

def main
  n = 10000000  # ten million
  a = randPerm(100)

  t0 = Time.now

  n.times do |i|
    a.index(i)
  end

  puts "%.5f" % [Time.now - t0]
end

def randPerm(n)
  (0...n).sort_by{rand}
end

The performance of your code varies between runs because the
ordering is always different and index is O(n) worst case.
call srand(0) before any rand calls to get a consistent seed.

I suspect your Go code has the same flaw (but I don't know Go)

The result

Ruby: 71.08961 secs
Go: 2.61975 secs

That's pretty huge difference (and worse I was told my Go index
function was "crazily inefficient" too, though personally I don't see
how it can be any better). So, I thought I'd mention it. Maybe it
would be possible to speed up

From what I can tell, rb_ary_index in array.c doesn't use the optimized
== definition in insns.def (which inlines some common comparisions) to
avoid rb_funcall overhead.

Maybe that can help this case...

Otoh, I would never use anything like Array#index on large arrays and/or
hot code in the first place. After all these years of Ruby, I've hardly
ever used Array#index. The only time in recent memory I used it was on
a 5-element array of short strings.

Updated by Anonymous over 10 years ago

On 08/26/2013 12:57 PM, Eric Wong wrote:

"trans (Thomas Sawyer)" wrote:

 def main
   n = 10000000  # ten million
   a = randPerm(100)

   t0 = Time.now

   n.times do |i|
     a.index(i)
   end

   puts "%.5f" % [Time.now - t0]
 end

 def randPerm(n)
   (0...n).sort_by{rand}
 end

The performance of your code varies between runs because the
ordering is always different and index is O(n) worst case.
call srand(0) before any rand calls to get a consistent seed.

The running time of this code won't vary much at all. The n=10000000
setting is much higher than a.size, so most #index calls will return
nil. The entire array is searched for almost all iterations.

Maybe the intent was for each iteration step to do this

a.index(i%n)

?

Updated by normalperson (Eric Wong) over 10 years ago

Joel VanderWerf wrote:

On 08/26/2013 12:57 PM, Eric Wong wrote:

The performance of your code varies between runs because the
ordering is always different and index is O(n) worst case.
call srand(0) before any rand calls to get a consistent seed.

The running time of this code won't vary much at all. The n=10000000
setting is much higher than a.size, so most #index calls will return
nil. The entire array is searched for almost all iterations.

I think you're right. I was on a new machine (Haswell) and enabled a
bunch kernel options I normally don't use (Hyper-threading,
full tickless, full preempt, automatic process group scheduling).
Lots of variables even when the system isn't loaded :o

Updated by trans (Thomas Sawyer) over 10 years ago

Yes, sorry. I meant to use a random index with each iteration, not i. But per the suggestion, I think i % 100 would be better.

I changed and reran the benchmarks. But even so the comparison still comes out about the same ratio:

Ruby: 35.66597
Go: 1.39305

Actions #5

Updated by nobu (Nobuyoshi Nakada) over 10 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r42704.
Thomas, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


array.c: optimized equality

  • array.c (rb_ary_index, rb_ary_rindex): use optimized equality to
    improve performance. [Feature #8820]
  • vm_insnhelper.c (rb_equal_opt): optimized equality function.
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0