Bug #17280

Dir.glob with FNM_DOTMATCH matches ".." and "." and results in duplicated entries

Added by Eregon (Benoit Daloze) about 1 month ago. Updated 7 days ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 2.6.6p146 (2020-03-31 revision 67876) [x86_64-linux]
[ruby-core:100504]

Description

% ruby -e 'p Dir.glob("**/*", File::FNM_DOTMATCH)'
[".", "bar", "bar/.", "bar/.baz", "bar/.baz/.", "bar/.baz/qux"]
% ruby -e 'p Dir.glob("**", File::FNM_DOTMATCH)' 
[".", "..", "bar"]
% ruby -e 'p Dir.glob("*", File::FNM_DOTMATCH)' 
[".", "..", "bar"]

I think ".." was never intended by the user here. Is it a bug?

Not sure about ".".

Note that it also causes duplicated entries: bar and bar/.baz each appear twice in the Array (as bar and bar/., and as bar/.baz and bar/.baz/.)!

I think .. should always be ignored for glob purposes, since it escapes the current directory.
And . seems useless and causes duplicates.

I think the intention of users of File::FNM_DOTMATCH is to match files/directories starting with a . like .baz.
Probably Dir.glob("**/{*,.*}") is a safer way to achieve that,
but still I think FNM_DOTMATCH should not produce such weird results.
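
As a stopgap that does not depend on how a given Ruby version treats . and .., the entries can be filtered out after the glob. This is only a minimal sketch: glob_with_dotfiles is a hypothetical helper name, and the temporary directory layout simply mirrors the bar/.baz/qux tree from the example above.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper (not part of Ruby): glob with FNM_DOTMATCH so dotfiles
# are matched, then drop every entry whose last component is "." or "..".
def glob_with_dotfiles(pattern, base: ".")
  Dir.glob(pattern, File::FNM_DOTMATCH, base: base)
     .reject { |path| [".", ".."].include?(File.basename(path)) }
     .sort
end

# Recreate the layout from the report in a scratch directory.
Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, "bar", ".baz"))
  FileUtils.touch(File.join(dir, "bar", ".baz", "qux"))

  p glob_with_dotfiles("**/*", base: dir)
  # => ["bar", "bar/.baz", "bar/.baz/qux"]
end
```

Filtering on File.basename removes both the top-level "." and ".." and nested entries such as "bar/.", so each directory is listed once.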

From https://github.com/oracle/truffleruby/issues/2116
I could not figure out what the intended semantics of FNM_DOTMATCH are with regard to the . and .. entries.


Related issues

Related to Ruby master - Bug #16831: Running `Pathname#glob` with `File::FNM_DOTMATCH` option loses `.` and `..` (Closed)
Related to Ruby master - Bug #17283: Why does Dir.glob's ** match files in current directory? (Closed)
#1

Updated by Eregon (Benoit Daloze) about 1 month ago

  • Related to Bug #16831: Running `Pathname#glob` with `File::FNM_DOTMATCH` option loses `.` and `..` added
#2

Updated by Eregon (Benoit Daloze) about 1 month ago

  • Related to Bug #17283: Why does Dir.glob's ** match files in current directory? added

Updated by jeremyevans0 (Jeremy Evans) 11 days ago

So there are two separate issues you are discussing. The first issue is that when a recursive glob is used ("**/*"), the same folder shows up under two paths, once in the parent directory and once in its own directory. The second issue is that when a non-recursive glob is used, the reference to the parent folder (..) shows up (this is always excluded in recursive mode, even without FNM_DOTMATCH).

The first issue we can solve by skipping the current directory entry if it is . and we are in recursive mode and the current path (parent directory) matches the previous entry in the resulting array. This approach works as glob uses a depth-first search and not a breadth-first search, as long as . is the first entry in the directory.
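
To illustrate the idea on the Ruby side (the actual change would be in the C glob code, so this is only a sketch on an already-built result array, and drop_redundant_dot_entries is a hypothetical name): an entry ending in /. is skipped when the entry kept just before it is its parent directory.

```ruby
# Hypothetical post-processing sketch, not the code from the pull request.
# Relies on depth-first order: a directory's "." entry directly follows
# the directory itself in the result array.
def drop_redundant_dot_entries(entries)
  entries.each_with_object([]) do |entry, kept|
    nested_dot = File.basename(entry) == "." && entry != "."
    # Skip "dir/." when "dir" was the previous entry we kept.
    next if nested_dot && kept.last == File.dirname(entry)

    kept << entry
  end
end

reported = [".", "bar", "bar/.", "bar/.baz", "bar/.baz/.", "bar/.baz/qux"]
p drop_redundant_dot_entries(reported)
# => [".", "bar", "bar/.baz", "bar/.baz/qux"]
```

The top-level "." is left alone here, since whether it should be returned at all is a separate question.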

The second issue we can solve by not matching the .. entry if the glob is magical == 2 (* sets that).

I added a pull request with possible fixes: https://github.com/ruby/ruby/pull/3789. We'll see if it passes CI. Even if so, this changes behavior and I'm not sure I consider the current behavior in either case a bug, so we should get feedback from more committers as to what the expected behavior is.

Updated by Eregon (Benoit Daloze) 11 days ago

Thanks, that looks good to me.

I think everyone expects FNM_DOTMATCH to match dotfiles, and . and .. are not dotfiles.

Or is there some other purpose for FNM_DOTMATCH?

Updated by Eregon (Benoit Daloze) 11 days ago

I wonder if we could simplify the logic to never include .. for Dir.glob, and only include . for the initial directory and nowhere else.
Are there cases where .. and . would ever be wanted, besides Dir.glob(".") => ["."]?

Not sure it's even a good idea to include . for Dir.glob("*", File::FNM_DOTMATCH), the intention of * seems clearly to list all files/subdirs under this directory, not this directory itself, and certainly not the parent directory.

Updated by Eregon (Benoit Daloze) 7 days ago

Currently, even Dir[".*"] and Dir["{*,.*}"] match .., which seems surprising.
The PR fixes those, and I think the resulting behavior is much more useful.
