Bug #10015

Performance regression in Dir#[]

Added by Aaron Patterson 8 months ago. Updated 2 months ago.

[ruby-core:63591]
Status:Closed
Priority:Normal
Assignee:-
ruby -v:ruby 2.2.0dev (2014-02-04 trunk 44802) [x86_64-darwin13.0] Backport:2.0.0: UNKNOWN, 2.1: UNKNOWN

Description

r44802 seems to have introduced a performance regression in Dir#[].

Here is the test program:

require 'benchmark'

puts Benchmark.realtime {
  glob = "minitest/*_plugin.rb{,.rb,.bundle}"
  $LOAD_PATH.map { |load_path|
    Dir["#{File.expand_path glob, load_path}"]
  }.flatten.select { |file| File.file? file.untaint }
}

Here is the test time for me:

$ ruby -v test.rb
ruby 2.2.0dev (2014-02-04 trunk 44801) [x86_64-darwin13.0]
0.000341
$ ruby -v test.rb
ruby 2.2.0dev (2014-02-04 trunk 44802) [x86_64-darwin13.0]
0.009333

r44801 seems much faster than r44802.

Associated revisions

Revision 48973
Added by Nobuyoshi Nakada 2 months ago

ChangeLog: fix ref of r48972. [Bug #10015]

Revision 48973
Added by Nobuyoshi Nakada 2 months ago

ChangeLog: fix ref of r48972. [Bug #10015]

Revision 48975
Added by Nobuyoshi Nakada 2 months ago

dir.c: shortcut for case-insensitive name

  • dir.c (glob_helper): shortcut for case-insensitive name by stopping reading after a matching name found. [Bug #10015]

Revision 48975
Added by Nobuyoshi Nakada 2 months ago

dir.c: shortcut for case-insensitive name

  • dir.c (glob_helper): shortcut for case-insensitive name by stopping reading after a matching name found. [Bug #10015]

Revision 48990
Added by Nobuyoshi Nakada 2 months ago

dir.c: replace_real_basename

  • dir.c (replace_real_basename): get the real name and replace the base name with it by getattrlist(2) if available. suggested by Matthew Draper at . [Bug #10015]
  • dir.c (glob_helper): get the real name of the whole path, not only the last name.

Revision 48990
Added by Nobuyoshi Nakada 2 months ago

dir.c: replace_real_basename

  • dir.c (replace_real_basename): get the real name and replace the base name with it by getattrlist(2) if available. suggested by Matthew Draper at . [Bug #10015]
  • dir.c (glob_helper): get the real name of the whole path, not only the last name.

History

#1 Updated by Nobuyoshi Nakada 8 months ago

  • Description updated (diff)

Yes.
Or use case-sensitive platforms.

#2 Updated by Aaron Patterson 8 months ago

@nobu this change causes a significant increase in Rails boot time on my system.

My application pays a 120ms price for this change. @hsbt will pay 520ms:

http://twitter.com/hsbt/status/486378979138367488

Is there a way we can keep the performance on case-insensitive file systems? If there really is no work-around, then we need to make some changes in RubyGems, bundler, minitest, and Rails before trunk is released because many people will complain about slow boot times. :(

#3 Updated by Nobuyoshi Nakada 8 months ago

It might affect unintentionally.
I'll check the bottleneck.

#4 Updated by Aaron Patterson 8 months ago

Thanks nobu, I really appreciate it! If it can't be fixed, I have ideas for avoiding the calls to Dir#[], but it will take time to get merged and released.

#5 Updated by Eric Wong 8 months ago

Btw, you're already paying significant costs for a case-insensitive FS
(not just Ruby, but things like git, too).

#6 Updated by Nobuyoshi Nakada 2 months ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

Applied in changeset r48973.


ChangeLog: fix ref of r48972. [Bug #10015]

#7 Updated by Tomoyuki Chikanaga 2 months ago

note: r48975 was reverted at r48976.

#8 Updated by Matthew Draper 2 months ago

I don't know what I'm talking about, but this seems like it could use fcntl(2) + F_GETPATH, or getattrlist(2) + ATTR_CMN_NAME.. or something along those lines -- we shouldn't need the loop+fnmatch at all, should we?

#9 Updated by Nobuyoshi Nakada 2 months ago

Thanks, that's it.

Also available in: Atom PDF