Project

General

Profile

Actions

Bug #9618

closed

Pathname#cleanpath creates mixed path separators

Added by daniel-rikowski (Daniel Rikowski) about 10 years ago. Updated over 9 years ago.

Status:
Closed
Assignee:
Target version:
-
ruby -v:
ruby 2.0.0p451 (2014-02-24) [i386-mingw32]
[ruby-core:61402]

Description

When using Pathname#cleanpath with a Windows path the resulting path contains a mixture of slashes and backslashes.

require 'pathname'
path = Pathname.new('c:\projects\ruby\bug\test.rb')
path.to_s               # => "c:\\projects\\ruby\\bug\\test.rb"
path.cleanpath.to_s     # => "c:\\projects/ruby/bug/test.rb"

I'd expect cleanpath to use the same path separator for all path segments. The problem doesn't happen on non-Windows platforms because there backslashes are not detected as path separators.

The problem is that the first path segment is added verbatim and only subsequent segments are joined by File::join.

Personally I'd prefer it to use File::SEPARATOR only, regardless of any original separator(s). That way it would blend with the current 'normalizing' behaviour of cleanpath, which then could be also used to normalize any existing separator weirdness and - for example - make a path compatible with Dir.glob (which can't use backslashes)

Updated by Eregon (Benoit Daloze) about 10 years ago

I agree on using / only in Pathname (converting in #initialize).

Updated by nobu (Nobuyoshi Nakada) about 10 years ago

Agree.

diff --git a/ext/pathname/pathname.c b/ext/pathname/pathname.c
index 3db97fc..a68ecd2 100644
--- a/ext/pathname/pathname.c
+++ b/ext/pathname/pathname.c
@@ -17,6 +17,12 @@ get_strpath(VALUE obj)
 static void
 set_strpath(VALUE obj, VALUE val)
 {
+    VALUE sep[2];
+    sep[0] = rb_const_get(rb_cFile, rb_intern("ALT_SEPARATOR"));
+    if (!NIL_P(sep[0])) {
+	sep[1] = rb_const_get(rb_cFile, rb_intern("SEPARATOR"));
+	rb_funcallv(val, rb_intern("tr!"), 2, sep);
+    }
     rb_ivar_set(obj, id_at_path, val);
 }
 

Updated by daniel-rikowski (Daniel Rikowski) about 10 years ago

I'm not sure if converting the separators in #initialize is the best way to go. I can see the advantages, but I can also imagine existing programs breaking because of that.

Personally I'd prefer a more conservative approach which would remove ALT_SEPARATOR in #cleanpath only.

That way #initialize would gain a new (minor) feature as opposed to Pathname losing a (major) feature. (i.e. the possibility to store file names in the native format of the current OS)

Updated by Eregon (Benoit Daloze) about 10 years ago

There might be the special case of UNC paths, does anyone know if UNC paths with forward slashes work on Windows?

Updated by daniel-rikowski (Daniel Rikowski) about 10 years ago

I could not find anything official regarding UNC paths with forward slashes in the Win32 documentation for paths (http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx)

So I did a few tests: They work on the command line and CreateFile works fine, too. Surprisingly they don't work in Explorer. Extended file names OTOH, e.g. names starting with \\?\, do not support forward slashes.

I guess even if they work officially, they are rather unusual. I suspect they'll most likely lead to some practical problems with 3rd-party apps or libs which do not share the same flexibility as the Windows kernel.

If not even the Windows Explorer can handle them properly...

Updated by Eregon (Benoit Daloze) about 10 years ago

Thanks for both of your answers!

Indeed I think I met the fact it did not work in Explorer, but if it works with the standard API that is what matters.
I would definitely go with something like nobu's solution then (the conversion should also be done when loading from JSON/yaml/etc, ideally every time @path is changed).

Updated by daniel-rikowski (Daniel Rikowski) about 10 years ago

No problem, I'm glad to help.

To be honest I'm a little concerned that my report of such a minor problem might result in such a drastic change.

But lets hope for the best :)

Updated by akr (Akira Tanaka) about 10 years ago

I like normalization in cleanpath only.

I feel normalization in initialize is too drastic.

I feel mixed separators is not clean, so it is natural to normalize in cleanpath.

Updated by nobu (Nobuyoshi Nakada) about 10 years ago

Updated.

    pathname.rb: separators
    
    * ext/pathname/lib/pathname.rb (cleanpath_aggressive): make all
      separators File::SEPARATOR from File::ALT_SEPARATOR.
      [ruby-core:61402] [Bug #9618]
    
    * ext/pathname/lib/pathname.rb (cleanpath_conservative): ditto.

	Modified   ext/pathname/lib/pathname.rb
diff --git a/ext/pathname/lib/pathname.rb b/ext/pathname/lib/pathname.rb
index 20c92e2..e64432e 100644
--- a/ext/pathname/lib/pathname.rb
+++ b/ext/pathname/lib/pathname.rb
@@ -113,6 +113,7 @@ class Pathname
         end
       end
     end
+    pre.tr!(File::ALT_SEPARATOR, File::SEPARATOR) if File::ALT_SEPARATOR
     if /#{SEPARATOR_PAT}/o =~ File.basename(pre)
       names.shift while names[0] == '..'
     end
@@ -161,6 +162,7 @@ class Pathname
       pre, base = r
       names.unshift base if base != '.'
     end
+    pre.tr!(File::ALT_SEPARATOR, File::SEPARATOR) if File::ALT_SEPARATOR
     if /#{SEPARATOR_PAT}/o =~ File.basename(pre)
       names.shift while names[0] == '..'
     end
	Modified   test/pathname/test_pathname.rb
diff --git a/test/pathname/test_pathname.rb b/test/pathname/test_pathname.rb
index ed79b5b..a74dad1 100644
--- a/test/pathname/test_pathname.rb
+++ b/test/pathname/test_pathname.rb
@@ -88,6 +88,10 @@ class TestPathname < Test::Unit::TestCase
     defassert(:cleanpath_aggressive, '/',       '///a/../..')
   end
 
+  if DOSISH
+    defassert(:cleanpath_aggressive, 'c:/foo/bar', 'c:\\foo\\bar')
+  end
+
   def cleanpath_conservative(path)
     Pathname.new(path).cleanpath(true).to_s
   end
@@ -124,6 +128,10 @@ class TestPathname < Test::Unit::TestCase
   defassert(:cleanpath_conservative, '/a',     '/../.././../a')
   defassert(:cleanpath_conservative, 'a/b/../../../../c/../d', 'a/b/../../../../c/../d')
 
+  if DOSISH
+    defassert(:cleanpath_conservative, 'c:/foo/bar', 'c:\\foo\\bar')
+  end
+
   if DOSISH_UNC
     defassert(:cleanpath_conservative, '//',     '//')
   else

Updated by akr (Akira Tanaka) almost 10 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

Applied in changeset r45827.


  • ext/pathname/lib/pathname.rb (cleanpath_aggressive): make all
    separators File::SEPARATOR from File::ALT_SEPARATOR.
    Reported by Daniel Rikowski.
    Fixed by Nobuyoshi Nakada. [Bug #9618]

  • ext/pathname/lib/pathname.rb (cleanpath_conservative): ditto.

Updated by usa (Usaku NAKAMURA) almost 10 years ago

  • Backport changed from 1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN to 1.9.3: REQUIRED, 2.0.0: UNKNOWN, 2.1: REQUIRED

Updated by usa (Usaku NAKAMURA) almost 10 years ago

  • Backport changed from 1.9.3: REQUIRED, 2.0.0: UNKNOWN, 2.1: REQUIRED to 2.0.0: REQUIRED, 2.1: REQUIRED

Updated by nagachika (Tomoyuki Chikanaga) over 9 years ago

  • Backport changed from 2.0.0: REQUIRED, 2.1: REQUIRED to 2.0.0: REQUIRED, 2.1: DONE

Backported into ruby_2_1 branch at r46911.

Updated by usa (Usaku NAKAMURA) over 9 years ago

  • Backport changed from 2.0.0: REQUIRED, 2.1: DONE to 2.0.0: DONE, 2.1: DONE

backported into ruby_2_0_0 at r47337.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0