Bug #10015

Performance regression in Dir#[]

Added by Aaron Patterson 9 months ago. Updated 2 days ago.

[ruby-core:63591]
Status:Feedback
Priority:Normal
Assignee:-
ruby -v:ruby 2.2.0dev (2014-02-04 trunk 44802) [x86_64-darwin13.0] Backport:2.0.0: UNKNOWN, 2.1: UNKNOWN

Description

r44802 seems to have introduced a performance regression in Dir#[].

Here is the test program:

require 'benchmark'

puts Benchmark.realtime {
  glob = "minitest/*_plugin.rb{,.rb,.bundle}"
  $LOAD_PATH.map { |load_path|
    Dir["#{File.expand_path glob, load_path}"]
  }.flatten.select { |file| File.file? file.untaint }
}

Here is the test time for me:

$ ruby -v test.rb
ruby 2.2.0dev (2014-02-04 trunk 44801) [x86_64-darwin13.0]
0.000341
$ ruby -v test.rb
ruby 2.2.0dev (2014-02-04 trunk 44802) [x86_64-darwin13.0]
0.009333

r44801 seems much faster than r44802.

Associated revisions

Revision 48973
Added by Nobuyoshi Nakada 3 months ago

ChangeLog: fix ref of r48972. [Bug #10015]

Revision 48973
Added by Nobuyoshi Nakada 3 months ago

ChangeLog: fix ref of r48972. [Bug #10015]

Revision 48975
Added by Nobuyoshi Nakada 3 months ago

dir.c: shortcut for case-insensitive name

  • dir.c (glob_helper): shortcut for case-insensitive name by stopping reading after a matching name found. [Bug #10015]

Revision 48975
Added by Nobuyoshi Nakada 3 months ago

dir.c: shortcut for case-insensitive name

  • dir.c (glob_helper): shortcut for case-insensitive name by stopping reading after a matching name found. [Bug #10015]

Revision 48990
Added by Nobuyoshi Nakada 3 months ago

dir.c: replace_real_basename

  • dir.c (replace_real_basename): get the real name and replace the base name with it by getattrlist(2) if available. suggested by Matthew Draper at . [Bug #10015]
  • dir.c (glob_helper): get the real name of the whole path, not only the last name.

Revision 48990
Added by Nobuyoshi Nakada 3 months ago

dir.c: replace_real_basename

  • dir.c (replace_real_basename): get the real name and replace the base name with it by getattrlist(2) if available. suggested by Matthew Draper at . [Bug #10015]
  • dir.c (glob_helper): get the real name of the whole path, not only the last name.

History

#1 Updated by Nobuyoshi Nakada 9 months ago

  • Description updated (diff)

Yes.
Or use case-sensitive platforms.

#2 Updated by Aaron Patterson 9 months ago

@nobu this change causes a significant increase in Rails boot time on my system.

My application pays a 120ms price for this change. @hsbt will pay 520ms:

http://twitter.com/hsbt/status/486378979138367488

Is there a way we can keep the performance on case-insensitive file systems? If there really is no work-around, then we need to make some changes in RubyGems, bundler, minitest, and Rails before trunk is released because many people will complain about slow boot times. :(

#3 Updated by Nobuyoshi Nakada 9 months ago

It might affect unintentionally.
I'll check the bottleneck.

#4 Updated by Aaron Patterson 9 months ago

Thanks nobu, I really appreciate it! If it can't be fixed, I have ideas for avoiding the calls to Dir#[], but it will take time to get merged and released.

#5 Updated by Eric Wong 9 months ago

Btw, you're already paying significant costs for a case-insensitive FS
(not just Ruby, but things like git, too).

#6 Updated by Nobuyoshi Nakada 3 months ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

Applied in changeset r48973.


ChangeLog: fix ref of r48972. [Bug #10015]

#7 Updated by Tomoyuki Chikanaga 3 months ago

note: r48975 was reverted at r48976.

#8 Updated by Matthew Draper 3 months ago

I don't know what I'm talking about, but this seems like it could use fcntl(2) + F_GETPATH, or getattrlist(2) + ATTR_CMN_NAME.. or something along those lines -- we shouldn't need the loop+fnmatch at all, should we?

#9 Updated by Nobuyoshi Nakada 3 months ago

Thanks, that's it.

#10 Updated by Adrien Siami 5 days ago

I've noticed a significant slowdown of a rails app when using ruby 2.2.1 rather than 2.1.5.

A slow page was running in 16 seconds with 2.2.1 where it was running in less than 3 seconds with 2.1.5

I tested this very simple benchmark :

require 'benchmark'

puts Benchmark.realtime {
  800000.times do
    Dir['/tmp']
  end
}

it runs in 3.1289695860032225 secs with 2.2.1 and 2.403 secs with 2.1.5

Is this issue still present ?

#11 Updated by Nobuyoshi Nakada 4 days ago

  • Status changed from Closed to Feedback

Adrien Siami wrote:

A slow page was running in 16 seconds with 2.2.1 where it was running in less than 3 seconds with 2.1.5
it runs in 3.1289695860032225 secs with 2.2.1 and 2.403 secs with 2.1.5

What's your platform and filesystem?
As the former seems pretty bigger than the later, they may be caused by different reasons.

#12 Updated by Adrien Siami 2 days ago

Mac OS X Yosemite with the standard filesystem, I believe HFS+ Insensitive

Also available in: Atom PDF