Bug #9618

Pathname#cleanpath creates mixed path separators

Added by Daniel Rikowski 12 months ago. Updated 6 months ago.

[ruby-core:61402]
Status:Closed
Priority:Low
Assignee:cruby-windows
ruby -v:ruby 2.0.0p451 (2014-02-24) [i386-mingw32] Backport:2.0.0: DONE, 2.1: DONE

Description

When using Pathname#cleanpath with a Windows path the resulting path contains a mixture of slashes and backslashes.

require 'pathname'
path = Pathname.new('c:\projects\ruby\bug\test.rb')
path.to_s               # => "c:\\projects\\ruby\\bug\\test.rb"
path.cleanpath.to_s     # => "c:\\projects/ruby/bug/test.rb"

I'd expect cleanpath to use the same path separator for all path segments. The problem doesn't happen on non-Windows platforms because there backslashes are not detected as path separators.

The problem is that the first path segment is added verbatim and only subsequent segments are joined by File::join.

Personally I'd prefer it to use File::SEPARATOR only, regardless of any original separator(s). That way it would blend with the current 'normalizing' behaviour of cleanpath, which then could be also used to normalize any existing separator weirdness and - for example - make a path compatible with Dir.glob (which can't use backslashes)

Associated revisions

Revision 45827
Added by Akira Tanaka 10 months ago

  • ext/pathname/lib/pathname.rb (cleanpath_aggressive): make all
    separators File::SEPARATOR from File::ALT_SEPARATOR.
    Reported by Daniel Rikowski.
    Fixed by Nobuyoshi Nakada. [Bug #9618]

  • ext/pathname/lib/pathname.rb (cleanpath_conservative): ditto.

Revision 45827
Added by Akira Tanaka 10 months ago

  • ext/pathname/lib/pathname.rb (cleanpath_aggressive): make all
    separators File::SEPARATOR from File::ALT_SEPARATOR.
    Reported by Daniel Rikowski.
    Fixed by Nobuyoshi Nakada. [Bug #9618]

  • ext/pathname/lib/pathname.rb (cleanpath_conservative): ditto.

Revision 46911
Added by Tomoyuki Chikanaga 7 months ago

merge revision(s) r45827: [Backport #9618]

* ext/pathname/lib/pathname.rb (cleanpath_aggressive): make all
  separators File::SEPARATOR from File::ALT_SEPARATOR.
  Reported by Daniel Rikowski.
  Fixed by Nobuyoshi Nakada.  [Bug #9618]

* ext/pathname/lib/pathname.rb (cleanpath_conservative): ditto.

Revision 47337
Added by Usaku NAKAMURA 6 months ago

merge revision(s) 45827: [Backport #9618]

* ext/pathname/lib/pathname.rb (cleanpath_aggressive): make all
  separators File::SEPARATOR from File::ALT_SEPARATOR.
  Reported by Daniel Rikowski.
  Fixed by Nobuyoshi Nakada.  [Bug #9618]

* ext/pathname/lib/pathname.rb (cleanpath_conservative): ditto.

History

#1 Updated by Benoit Daloze 12 months ago

I agree on using / only in Pathname (converting in #initialize).

#2 Updated by Nobuyoshi Nakada 12 months ago

Agree.

diff --git a/ext/pathname/pathname.c b/ext/pathname/pathname.c
index 3db97fc..a68ecd2 100644
--- a/ext/pathname/pathname.c
+++ b/ext/pathname/pathname.c
@@ -17,6 +17,12 @@ get_strpath(VALUE obj)
 static void
 set_strpath(VALUE obj, VALUE val)
 {
+    VALUE sep[2];
+    sep[0] = rb_const_get(rb_cFile, rb_intern("ALT_SEPARATOR"));
+    if (!NIL_P(sep[0])) {
+   sep[1] = rb_const_get(rb_cFile, rb_intern("SEPARATOR"));
+   rb_funcallv(val, rb_intern("tr!"), 2, sep);
+    }
     rb_ivar_set(obj, id_at_path, val);
 }

#3 Updated by Daniel Rikowski 12 months ago

I'm not sure if converting the separators in #initialize is the best way to go. I can see the advantages, but I can also imagine existing programs breaking because of that.

Personally I'd prefer a more conservative approach which would remove ALT_SEPARATOR in #cleanpath only.

That way #initialize would gain a new (minor) feature as opposed to Pathname losing a (major) feature. (i.e. the possibility to store file names in the native format of the current OS)

#4 Updated by Benoit Daloze 12 months ago

There might be the special case of UNC paths, does anyone know if UNC paths with forward slashes work on Windows?

#5 Updated by Nobuyoshi Nakada 12 months ago

It works.

#6 Updated by Daniel Rikowski 12 months ago

I could not find anything official regarding UNC paths with forward slashes in the Win32 documentation for paths (http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx)

So I did a few tests: They work on the command line and CreateFile works fine, too. Surprisingly they don't work in Explorer. Extended file names OTOH, e.g. names starting with \\?\, do not support forward slashes.

I guess even if they work officially, they are rather unusual. I suspect they'll most likely lead to some practical problems with 3rd-party apps or libs which do not share the same flexibility as the Windows kernel.

If not even the Windows Explorer can handle them properly...

#7 Updated by Benoit Daloze 12 months ago

Thanks for both of your answers!

Indeed I think I met the fact it did not work in Explorer, but if it works with the standard API that is what matters.
I would definitely go with something like nobu's solution then (the conversion should also be done when loading from JSON/yaml/etc, ideally every time @path is changed).

#8 Updated by Daniel Rikowski 12 months ago

No problem, I'm glad to help.

To be honest I'm a little concerned that my report of such a minor problem might result in such a drastic change.

But lets hope for the best :)

#9 Updated by Akira Tanaka 12 months ago

I like normalization in cleanpath only.

I feel normalization in initialize is too drastic.

I feel mixed separators is not clean, so it is natural to normalize in cleanpath.

#10 Updated by Nobuyoshi Nakada 12 months ago

Updated.

    pathname.rb: separators

    * ext/pathname/lib/pathname.rb (cleanpath_aggressive): make all
      separators File::SEPARATOR from File::ALT_SEPARATOR.
       [Bug #9618]

    * ext/pathname/lib/pathname.rb (cleanpath_conservative): ditto.

    Modified   ext/pathname/lib/pathname.rb
diff --git a/ext/pathname/lib/pathname.rb b/ext/pathname/lib/pathname.rb
index 20c92e2..e64432e 100644
--- a/ext/pathname/lib/pathname.rb
+++ b/ext/pathname/lib/pathname.rb
@@ -113,6 +113,7 @@ class Pathname
         end
       end
     end
+    pre.tr!(File::ALT_SEPARATOR, File::SEPARATOR) if File::ALT_SEPARATOR
     if /#{SEPARATOR_PAT}/o =~ File.basename(pre)
       names.shift while names[0] == '..'
     end
@@ -161,6 +162,7 @@ class Pathname
       pre, base = r
       names.unshift base if base != '.'
     end
+    pre.tr!(File::ALT_SEPARATOR, File::SEPARATOR) if File::ALT_SEPARATOR
     if /#{SEPARATOR_PAT}/o =~ File.basename(pre)
       names.shift while names[0] == '..'
     end
    Modified   test/pathname/test_pathname.rb
diff --git a/test/pathname/test_pathname.rb b/test/pathname/test_pathname.rb
index ed79b5b..a74dad1 100644
--- a/test/pathname/test_pathname.rb
+++ b/test/pathname/test_pathname.rb
@@ -88,6 +88,10 @@ class TestPathname < Test::Unit::TestCase
     defassert(:cleanpath_aggressive, '/',       '///a/../..')
   end

+  if DOSISH
+    defassert(:cleanpath_aggressive, 'c:/foo/bar', 'c:\\foo\\bar')
+  end
+
   def cleanpath_conservative(path)
     Pathname.new(path).cleanpath(true).to_s
   end
@@ -124,6 +128,10 @@ class TestPathname < Test::Unit::TestCase
   defassert(:cleanpath_conservative, '/a',     '/../.././../a')
   defassert(:cleanpath_conservative, 'a/b/../../../../c/../d', 'a/b/../../../../c/../d')

+  if DOSISH
+    defassert(:cleanpath_conservative, 'c:/foo/bar', 'c:\\foo\\bar')
+  end
+
   if DOSISH_UNC
     defassert(:cleanpath_conservative, '//',     '//')
   else

#11 Updated by Akira Tanaka 10 months ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

Applied in changeset r45827.


  • ext/pathname/lib/pathname.rb (cleanpath_aggressive): make all
    separators File::SEPARATOR from File::ALT_SEPARATOR.
    Reported by Daniel Rikowski.
    Fixed by Nobuyoshi Nakada. [Bug #9618]

  • ext/pathname/lib/pathname.rb (cleanpath_conservative): ditto.

#12 Updated by Usaku NAKAMURA 8 months ago

  • Backport changed from 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN to 1.9.3: REQUIRED, 2.0.0: UNKNOWN, 2.1: REQUIRED

#13 Updated by Usaku NAKAMURA 8 months ago

  • Backport changed from 1.9.3: REQUIRED, 2.0.0: UNKNOWN, 2.1: REQUIRED to 2.0.0: REQUIRED, 2.1: REQUIRED

#14 Updated by Tomoyuki Chikanaga 7 months ago

  • Backport changed from 2.0.0: REQUIRED, 2.1: REQUIRED to 2.0.0: REQUIRED, 2.1: DONE

Backported into ruby_2_1 branch at r46911.

#15 Updated by Usaku NAKAMURA 6 months ago

  • Backport changed from 2.0.0: REQUIRED, 2.1: DONE to 2.0.0: DONE, 2.1: DONE

backported into ruby_2_0_0 at r47337.

Also available in: Atom PDF