Ruby Issue Tracking System Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=47562009-07-16T16:10:38Zyugui (Yuki Sonoda)yugui@yugui.jp
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Assigned</i></li><li><strong>Assignee</strong> set to <i>usa (Usaku NAKAMURA)</i></li></ul><p>=begin</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=52452009-08-12T15:48:45Zusa (Usaku NAKAMURA)usa@garbagecollect.jp
<ul><li><strong>Priority</strong> changed from <i>Normal</i> to <i>5</i></li></ul><p>=begin</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=77672010-01-22T22:01:37Zvo.x (Vit Ondruch)v.ondruch@tiscali.cz
<ul></ul><p>=begin<br>
Is there any progress regarding this issue(s)?<br>
=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=92942010-03-25T10:13:59Zspatulasnout (B Kelly)billk@cts.com
<ul><li><strong>File</strong> <a href="/attachments/909">spatulasnout-unicode-mkdir-diffs.txt</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/909/spatulasnout-unicode-mkdir-diffs.txt">spatulasnout-unicode-mkdir-diffs.txt</a> added</li><li><strong>File</strong> <a href="/attachments/910">test_io_unicode_paths.rb</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/910/test_io_unicode_paths.rb">test_io_unicode_paths.rb</a> added</li></ul><p>=begin<br>
Hi,</p>
<p>I'll be needing win32 unicode path support for my current project, so I would like to try to tackle the remaining issues.</p>
<p>I started with a relatively easy one, Dir.mkdir</p>
<p>For Dir.mkdir, I took an approach similar to what was already in place for rb_sysopen(), which is that it tries to call w32_conv_to_utf16() on the path, and if it succeeds calls the new rb_w32_wmkdir() with the wide path; otherwise it falls back to calling the old rb_w32_mkdir().</p>
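<p>As a rough illustration of that dispatch (not the attached patch itself), the try-wide-then-fall-back logic can be sketched in Ruby. The names to_wide and pick_mkdir are hypothetical stand-ins for w32_conv_to_utf16() and for the choice between rb_w32_wmkdir() and rb_w32_mkdir(); the sketch only reports which branch would be taken and never touches the filesystem.</p>

```ruby
# Hypothetical stand-in for w32_conv_to_utf16(): returns the UTF-16LE
# form of the path, or nil when the bytes are not valid UTF-8.
def to_wide(path)
  path.encode("UTF-16LE", "UTF-8")
rescue EncodingError
  nil
end

# Hypothetical dispatcher: prefer the wide-character call when the
# conversion succeeds, otherwise fall back to the narrow (ANSI) call.
def pick_mkdir(path)
  wide = to_wide(path)
  wide ? [:wide, wide] : [:narrow, path]
end

kind, _arg = pick_mkdir("\u52ec\u52ee")
puts kind   # wide
kind, _arg = pick_mkdir("\xFF\xFE".b)
puts kind   # narrow
```

<p>Returning the converted string together with the branch keeps the decision and the argument for the chosen call in one place, which mirrors how the C code would pass either the wide or the narrow path onward.</p>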
<p>Attached files should include the diffs, and a new file adding a bootstrap test for unicode paths. (The tests currently fail, because they need a working unicode stat and unlink in order to function.)</p>
<p>I'm planning to attempt File.stat next, but I have some questions about it so I'll post separately.</p>
<p>Regards,</p>
<p>Bill</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=93112010-03-25T17:01:32Zvo.x (Vit Ondruch)v.ondruch@tiscali.cz
<ul></ul><p>=begin<br>
Hello Bill,</p>
<p>Are you aware of win32_unicode_branch? It's not up to date as far as I know, but it covers a lot of Unicode functionality. What is mainly missing is Dir.glob.</p>
<p>Vit<br>
=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=93132010-03-25T19:10:35Zspatulasnout (B Kelly)billk@cts.com
<ul></ul><p>=begin<br>
Hi Vit,</p>
<p>Thanks. Wow. Good to know.</p>
<p>win32/win32.c:rb_w32_uchown(const char *path, int owner, int group)<br>
win32/win32.c:rb_w32_ulink(const char *from, const char *to)<br>
win32/win32.c:rb_w32_urename(const char *from, const char *to)<br>
win32/win32.c:rb_w32_ustati64(const char *path, struct stati64 *st)<br>
win32/win32.c:rb_w32_uaccess(const char *path, int mode)<br>
win32/win32.c:rb_w32_uopen(const char *file, int oflag, ...)<br>
win32/win32.c:rb_w32_uutime(const char *path, const struct utimbuf *times)<br>
win32/win32.c:rb_w32_utime(const char *path, const struct utimbuf *times)<br>
win32/win32.c:rb_w32_uchdir(const char *path)<br>
win32/win32.c:rb_w32_umkdir(const char *path, int mode)<br>
win32/win32.c:rb_w32_urmdir(const char *path)<br>
win32/win32.c:rb_w32_uunlink(const char *path)<br>
win32/win32.c:rb_w32_unlink(const char *path)<br>
win32/win32.c:rb_w32_uchmod(const char *path, int mode)</p>
<p>And it looks like a much cleaner implementation than what<br>
is currently in the 1.9.2 trunk. No more 'wchar' in<br>
sysopen_struct, no more #ifdef _WIN32 surrounding<br>
w32_conv_to_utf16 logic, just some defines at the top:</p>
<p>dir.c:#define chdir(p) rb_w32_uchdir(p)<br>
dir.c:#define mkdir(p, m) rb_w32_umkdir(p, m)<br>
dir.c:#define rmdir(p) rb_w32_urmdir(p)<br>
file.c:#define STAT(p, s) rb_w32_ustati64(p, s)<br>
file.c:#define lstat(p, s) rb_w32_ustati64(p, s)<br>
file.c:#define access(p, m) rb_w32_uaccess(p, m)<br>
file.c:#define chmod(p, m) rb_w32_uchmod(p, m)<br>
file.c:#define chown(p, o, g) rb_w32_uchown(p, o, g)<br>
file.c:#define utime(p, t) rb_w32_uutime(p, t)<br>
file.c:#define link(f, t) rb_w32_ulink(f, t)<br>
file.c:#define unlink(p) rb_w32_uunlink(p)<br>
file.c:#define rename(f, t) rb_w32_urename(f, t)<br>
io.c:#define open rb_w32_uopen</p>
<p>I wonder if there is a reason this should not be merged<br>
into trunk ASAP?</p>
<p>Regards,</p>
<p>Bill</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=93192010-03-25T22:28:21Zusa (Usaku NAKAMURA)usa@garbagecollect.jp
<ul></ul><p>=begin<br>
Hello,</p>
<p>In message "<a href="https://blade.ruby-lang.org/ruby-core/28979">[ruby-core:28979]</a> [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>] Some windows unicode path issues remain"<br>
on Mar.25,2010 19:10:35, <a href="mailto:redmine@ruby-lang.org" class="email">redmine@ruby-lang.org</a> wrote:</p>
<blockquote>
<p>I wonder if there is a reason this should not be merged<br>
into trunk ASAP?</p>
</blockquote>
<p>Because I'm too busy to test this branch well :(</p>
<p>Endoh-san says that the feature freeze is March 31.<br>
So the merge has to be completed by then<br>
if we want to include it in the 1.9.2 release...</p>
<p>win32-unicode-branch does not contain the globbing features<br>
yet, as Vit pointed out in <a href="https://blade.ruby-lang.org/ruby-core/28977">[ruby-core:28977]</a> (thank you, Vit).<br>
However, because globbing is tied to command line interpretation,<br>
it might be difficult to implement by March 31.<br>
Should we wait until all functions are covered, or merge the<br>
current one?</p>
<p>Summary:<br>
(1) need the decision whether merging it or not<br>
(2) need testers :)<br>
(3) need the worker(s) to make the patch to trunk</p>
<p>Regards,</p>
<p>U.Nakamura <a href="mailto:usa@garbagecollect.jp" class="email">usa@garbagecollect.jp</a></p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=93212010-03-25T22:57:00Zvo.x (Vit Ondruch)v.ondruch@tiscali.cz
<ul></ul><p>=begin<br>
For me, it would be helpful to merge what we have now. I am not aware of any problems with the methods that are already implemented. However, I am aware that this can lead to confusion as long as not everything works with Unicode :(</p>
<p>Vit<br>
=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=94752010-03-28T16:51:39Zspatulasnout (B Kelly)billk@cts.com
<ul></ul><p>=begin<br>
Hi,</p>
<p>U.Nakamura wrote:</p>
<blockquote>
<p>Hello,</p>
<p>In message "<a href="https://blade.ruby-lang.org/ruby-core/28979">[ruby-core:28979]</a> [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>] Some windows unicode path issues remain"<br>
on Mar.25,2010 19:10:35, <a href="mailto:redmine@ruby-lang.org" class="email">redmine@ruby-lang.org</a> wrote:</p>
<blockquote>
<p>I wonder if there is a reason this should not be merged<br>
into trunk ASAP?</p>
</blockquote>
<p>Because I'm too busy to test this branch well :(</p>
<p>Endoh-san says that the feature freeze is March 31.<br>
So the merge has to be completed by then<br>
if we want to include it in the 1.9.2 release...</p>
<p>win32-unicode-branch does not contain the globbing features<br>
yet, as Vit pointed out in <a href="https://blade.ruby-lang.org/ruby-core/28977">[ruby-core:28977]</a> (thank you, Vit).<br>
However, because globbing is tied to command line interpretation,<br>
it might be difficult to implement by March 31.</p>
</blockquote>
<p>I understand how this might be considered a 'feature', but<br>
I think it is also possible to consider it a bug fix.</p>
<p>1.9.1 was supposed to support unicode path on win32, but<br>
this was deferred to 1.9.2.</p>
<p>Nevertheless, I quote matz from November, 2008:</p>
<p>Yukihiro Matsumoto wrote:</p>
<blockquote>
<p>Hi,</p>
<p>In message "Re: <a href="https://blade.ruby-lang.org/ruby-core/20109">[ruby-core:20109]</a> Re: 1.9, encoding & win32 wide char support"<br>
on Wed, 26 Nov 2008 12:26:53 +0900, "Bill Kelly" <a href="mailto:billk@cts.com" class="email">billk@cts.com</a> writes:</p>
<p>|> Does anyone have information as to the current status of<br>
|> adding Unicode-savvy path handling to 1.9 ruby?<br>
|<br>
|Ugh. Sorry, I mean of course: Unicode-savvy path handling<br>
|on <em>win32</em> ruby 1.9.</p>
<p>Every path encoding is UTF-8 and converted to UTF-16 internally. If<br>
anything still uses *A functions, it will eventually be replaced<br>
by *W functions. In short, if you're using UTF-8 for your program<br>
encoding, you should not see any problem (if you do, it's a bug).</p>
<p>matz.</p>
</blockquote>
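<p>To see why the *A (code page) functions cannot carry such paths while the *W (UTF-16) functions can, here is a small Ruby sketch. Windows-1252 is used only as an example ANSI code page, which is an assumption, and the helper name representable? is hypothetical.</p>

```ruby
# Hypothetical helper: can +name+ be expressed in the given encoding?
def representable?(name, encoding)
  name.encode(encoding)
  true
rescue Encoding::UndefinedConversionError
  false
end

name = "\u52ec\u52ee\u52f1\u52f2"           # UTF-8 directory name

puts representable?(name, "Windows-1252")   # false
puts representable?(name, "UTF-16LE")       # true
puts name.encode("UTF-16LE").bytesize       # 8
```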
<p>I don't know if matz has changed his mind, but personally I would<br>
like to consider it a bug that Ruby 1.9.x fails for Unicode paths<br>
on Windows.</p>
<blockquote>
<p>Should we wait until all functions are covered, or merge the<br>
current one?</p>
<p>Summary:<br>
(1) need the decision whether merging it or not<br>
(2) need testers :)<br>
(3) need the worker(s) to make the patch to trunk</p>
</blockquote>
<p>(1) Please, yes. Let us merge. 93.75% is better than current 6.25% coverage.<br>
(2) I hope to contribute unicode_path unit-tests. (such as in bootstraptest/)<br>
(3) I would like to contribute to the patch if my efforts can be useful.<br>
(diffs on io.c, file.c, and dir.c look pretty straightforward.)<br>
(diffs on win32/win32.c look more difficult, but I can attempt.)</p>
<p>Regards,</p>
<p>Bill</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=94832010-03-28T23:57:52Zusa (Usaku NAKAMURA)usa@garbagecollect.jp
<ul></ul><p>=begin<br>
Hello,</p>
<p>In message "<a href="https://blade.ruby-lang.org/ruby-core/29082">[ruby-core:29082]</a> Re: [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>] Some windows unicode path issues remain"<br>
on Mar.28,2010 16:51:26, <a href="mailto:billk@cts.com" class="email">billk@cts.com</a> wrote:</p>
<blockquote>
<p>I understand how this might be considered a 'feature', but<br>
I think it is also possible to consider it a bug fix.</p>
</blockquote>
<p>Hmm, it has a point in it.<br>
The branch manager should judge whether this change is a bug fix or<br>
feature change.<br>
How do you think, Yugui-san?</p>
<blockquote>
<blockquote>
<p>Summary:<br>
(1) need the decision whether merging it or not<br>
(2) need testers :)<br>
(3) need the worker(s) to make the patch to trunk</p>
</blockquote>
<p>(1) Please, yes. Let us merge. 93.75% is better than current 6.25% coverage.<br>
(2) I hope to contribute unicode_path unit-tests. (such as in bootstraptest/)<br>
(3) I would like to contribute to the patch if my efforts can be useful.<br>
(diffs on io.c, file.c, and dir.c look pretty straightforward.)<br>
(diffs on win32/win32.c look more difficult, but I can attempt.)</p>
</blockquote>
<p>I'm very glad to hear your offer of cooperation.<br>
Thank you!</p>
<p>Regards,</p>
<p>U.Nakamura <a href="mailto:usa@garbagecollect.jp" class="email">usa@garbagecollect.jp</a></p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=96242010-04-01T03:16:10Zspatulasnout (B Kelly)billk@cts.com
<ul></ul><p>=begin<br>
U.Nakamura wrote:</p>
<blockquote>
<p>Hello,</p>
<p>In message "<a href="https://blade.ruby-lang.org/ruby-core/29082">[ruby-core:29082]</a> Re: [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>] Some windows unicode path issues remain"<br>
on Mar.28,2010 16:51:26, <a href="mailto:billk@cts.com" class="email">billk@cts.com</a> wrote:</p>
<blockquote>
<p>I understand how this might be considered a 'feature', but<br>
I think it is also possible to consider it a bug fix.</p>
</blockquote>
<p>Hmm, it has a point in it.<br>
The branch manager should judge whether this change is a bug fix or<br>
feature change.<br>
How do you think, Yugui-san?</p>
</blockquote>
<p>Any word?</p>
<p>Regards,</p>
<p>Bill</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=96362010-04-01T10:30:30Zyugui (Yuki Sonoda)yugui@yugui.jp
<ul></ul><p>=begin</p>
<blockquote>
<p>The branch manager should judge whether this change is a bug fix or<br>
feature change.<br>
How do you think, Yugui-san?</p>
</blockquote>
<p>It's a bug fix.<br>
=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=96452010-04-01T14:52:49Zspatulasnout (B Kelly)billk@cts.com
<ul></ul><p>=begin<br>
Yuki Sonoda wrote:</p>
<blockquote>
<p>Issue <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a> has been updated by Yuki Sonoda.</p>
<blockquote>
<p>The branch manager should judge whether this change is a bug fix or<br>
feature change.<br>
How do you think, Yugui-san?</p>
</blockquote>
<p>It's a bug fix.</p>
</blockquote>
<p>Wonderful news!</p>
<p>Thank you,</p>
<p>Bill</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=106662010-04-30T08:12:48Zspatulasnout (B Kelly)billk@cts.com
<ul></ul><p>=begin<br>
Hi,</p>
<p>Bill Kelly wrote:</p>
<blockquote>
<p>Yuki Sonoda wrote:</p>
<blockquote>
<p>Issue <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a> has been updated by Yuki Sonoda.</p>
<blockquote>
<p>The branch manager should judge whether this change is a bug fix or<br>
feature change.<br>
How do you think, Yugui-san?</p>
</blockquote>
<p>It's a bug fix.</p>
</blockquote>
<p>Wonderful news!</p>
</blockquote>
<p>In order to avoid duplication of effort, I wanted to inquire<br>
whether anyone else may currently be working on Windows<br>
Unicode related code?</p>
<p>U.Nakamura wrote:</p>
<blockquote>
<p>(2) need testers :)<br>
(3) need the worker(s) to make the patch to trunk</p>
</blockquote>
<p>If there is no conflict with others' work, I would like to<br>
attempt merging the win32-unicode branch into trunk within<br>
the next week or two.</p>
<p>Regards,</p>
<p>Bill</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=107842010-05-05T03:49:19Zusa (Usaku NAKAMURA)usa@garbagecollect.jp
<ul></ul><p>=begin<br>
Hello,</p>
<p>In message "<a href="https://blade.ruby-lang.org/ruby-core/29892">[ruby-core:29892]</a> Re: [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>] Some windows unicode path issues remain"<br>
on Apr.30,2010 08:12:33, <a href="mailto:billk@cts.com" class="email">billk@cts.com</a> wrote:<br>
| In order to avoid duplication of effort, I wanted to inquire<br>
| whether anyone else may currently be working on Windows<br>
| Unicode related code?<br>
|<br>
|<br>
| U.Nakamura wrote:<br>
| ><br>
| > (2) need testers :)<br>
| > (3) need the worker(s) to make the patch to trunk<br>
|<br>
|<br>
| If there is no conflict with others' work, I would like to<br>
| attempt merging the win32-unicode branch into trunk within<br>
| the next week or two.</p>
<p>Ah, I've merged most parts of win32-unicode-test branch because<br>
the time limit of code freeze (Apr.30) has come.</p>
<p>See r27570</p>
<p>Of course, test cases and bug reports are welcomed.</p>
<p>Regards</p>
<p>U.Nakamura <a href="mailto:usa@garbagecollect.jp" class="email">usa@garbagecollect.jp</a></p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=107922010-05-05T15:35:27Zspatulasnout (B Kelly)billk@cts.com
<ul></ul><p>=begin<br>
U.Nakamura wrote:</p>
<blockquote>
<p>Hello,</p>
<p>In message "<a href="https://blade.ruby-lang.org/ruby-core/29892">[ruby-core:29892]</a> Re: [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>] Some windows unicode path issues remain"<br>
on Apr.30,2010 08:12:33, <a href="mailto:billk@cts.com" class="email">billk@cts.com</a> wrote:<br>
|<br>
| If there is no conflict with others' work, I would like to<br>
| attempt merging the win32-unicode branch into trunk within<br>
| the next week or two.</p>
<p>Ah, I've merged most parts of win32-unicode-test branch because<br>
the time limit of code freeze (Apr.30) has come.</p>
<p>See r27570</p>
</blockquote>
<p>Oh! Thank you very much!</p>
<p>(I had thought the code freeze applied to new features, rather<br>
than bug fixes.)</p>
<blockquote>
<p>Of course, test cases and bug reports are welcomed.</p>
</blockquote>
<p>My initial attempt at a bootstraptest for unicode path<br>
support is failing.</p>
<p>It is incomplete, but I uploaded the current version:</p>
<p><a href="http://redmine.ruby-lang.org/attachments/download/910" class="external">http://redmine.ruby-lang.org/attachments/download/910</a></p>
<p>It is failing at:</p>
<pre><code>DNAME_CHINESE = "\u52ec\u52ee\u52f1\u52f2"
Dir.mkdir DNAME_CHINESE
test(?d, DNAME_CHINESE) or raise "test ?d fail"
</code></pre>
<p>It seems rb_stat in file.c calls stat(), but stat does<br>
not map to the unicode version.</p>
<p>win32.h:</p>
<pre><code>#define stat(path,st) rb_w32_stat(path,st)
</code></pre>
<p>file.c:</p>
<pre><code>static int
rb_stat(VALUE file, struct stat *st)
{
    VALUE tmp;

    rb_secure(2);
    tmp = rb_check_convert_type(file, T_FILE, "IO", "to_io");
    if (!NIL_P(tmp)) {
        rb_io_t *fptr;
        GetOpenFile(tmp, fptr);
        return fstat(fptr->fd, st);
    }
    FilePathValue(file);
    file = rb_str_encode_ospath(file);
    return stat(StringValueCStr(file), st);
}
</code></pre>
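<p>For anyone who wants to reproduce the check outside bootstraptest, a self-contained variant might look like the following. It is only a sketch: it assumes the stdlib tmpdir library is available, uses a local variable instead of the DNAME_CHINESE constant, and works in a scratch directory so that nothing is left behind.</p>

```ruby
require "tmpdir"

dname = "\u52ec\u52ee\u52f1\u52f2"

# Work inside a scratch directory; the created directory is removed
# explicitly before the scratch directory itself is cleaned up.
result = Dir.mktmpdir do |scratch|
  path = File.join(scratch, dname)
  Dir.mkdir(path)
  seen = test(?d, path) && File.directory?(path)
  Dir.rmdir(path)
  seen
end

puts result ? "stat sees the directory" : "test ?d fail"
```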
<p>Regards,</p>
<p>Bill</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=107932010-05-05T15:57:08Zusa (Usaku NAKAMURA)usa@garbagecollect.jp
<ul></ul><p>=begin<br>
Hello,</p>
<p>In message "<a href="https://blade.ruby-lang.org/ruby-core/30012">[ruby-core:30012]</a> Re: [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>] Some windows unicode path issues remain"<br>
on May.05,2010 15:35:11, <a href="mailto:billk@cts.com" class="email">billk@cts.com</a> wrote:<br>
| My initial attempt at a bootstraptest for unicode path<br>
| support is failing.<br>
|<br>
| It is incomplete, but I uploaded the current version:<br>
|<br>
| <a href="http://redmine.ruby-lang.org/attachments/download/910" class="external">http://redmine.ruby-lang.org/attachments/download/910</a><br>
|<br>
| It is failing at:<br>
|<br>
| DNAME_CHINESE = "\u52ec\u52ee\u52f1\u52f2"<br>
| Dir.mkdir DNAME_CHINESE<br>
| test(?d, DNAME_CHINESE) or raise "test ?d fail"<br>
|<br>
|<br>
| It seems rb_stat in file.c calls stat(), but stat does<br>
| not map to the unicode version.</p>
<p>Oops, thank you!</p>
<p>Regards</p>
<p>U.Nakamura <a href="mailto:usa@garbagecollect.jp" class="email">usa@garbagecollect.jp</a></p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=108252010-05-06T19:39:41Zspatulasnout (B Kelly)billk@cts.com
<ul></ul><p>=begin<br>
U.Nakamura wrote:</p>
<blockquote>
<p>In message "<a href="https://blade.ruby-lang.org/ruby-core/30012">[ruby-core:30012]</a> Re: [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>] Some windows unicode path issues remain"<br>
on May.05,2010 15:35:11, <a href="mailto:billk@cts.com" class="email">billk@cts.com</a> wrote:<br>
|<br>
| It seems rb_stat in file.c calls stat(), but stat does<br>
| not map to the unicode version.</p>
<p>Oops, thank you!</p>
</blockquote>
<p>Thanks, the test gets much further now.</p>
<p>It now fails at the last line:</p>
<pre><code>Dir.chdir DNAME_CHINESE
cwd = Dir.pwd
( cwd[(-DNAME_CHINESE.length)..-1] == DNAME_CHINESE ) or raise "cwd check fail"
</code></pre>
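<p>A self-contained way to try the same working-directory check is sketched below. It assumes the stdlib tmpdir library and compares raw bytes, since the string returned by Dir.pwd may carry a different encoding tag than the UTF-8 literal; the byte-wise comparison is a choice made for this sketch, not what the bootstrap test does.</p>

```ruby
require "tmpdir"

dname = "\u52ec\u52ee\u52f1\u52f2"

suffix_ok = Dir.mktmpdir do |scratch|
  target = File.join(scratch, dname)
  Dir.mkdir(target)
  begin
    # The block form of Dir.chdir restores the previous directory on exit.
    Dir.chdir(target) do
      # Compare bytes so the check does not depend on the encoding tag.
      Dir.pwd.b.end_with?(dname.b)
    end
  ensure
    Dir.rmdir(target)
  end
end

puts suffix_ok ? "cwd check ok" : "cwd check fail"
```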
<p>Until now there was only rb_w32_getcwd, so I have added a Unicode<br>
variant, rb_w32_ugetcwd:</p>
<pre><code>Index: include/ruby/win32.h
===================================================================
--- include/ruby/win32.h (revision 27644)
+++ include/ruby/win32.h (working copy)
@@ -254,6 +254,7 @@
extern struct servent *WSAAPI rb_w32_getservbyport(int, const char *);
extern int rb_w32_socketpair(int, int, int, int *);
extern char * rb_w32_getcwd(char *, int);
+extern char * rb_w32_ugetcwd(char *, int);
extern char * rb_w32_getenv(const char *);
extern int rb_w32_rename(const char *, const char *);
extern int rb_w32_urename(const char *, const char *);
@@ -611,7 +612,7 @@
#define get_osfhandle(h) rb_w32_get_osfhandle(h)
#undef getcwd
-#define getcwd(b, s) rb_w32_getcwd(b, s)
+#define getcwd(b, s) rb_w32_ugetcwd(b, s)
#undef getenv
#define getenv(n) rb_w32_getenv(n)
Index: win32/win32.c
===================================================================
--- win32/win32.c (revision 27644)
+++ win32/win32.c (working copy)
@@ -3692,6 +3692,57 @@
return p;
}
+char *
+rb_w32_ugetcwd(char *buffer, int size)
+{
+ char *p;
+ WCHAR *wp;
+ long len, wlen;
+
+ wlen = GetCurrentDirectoryW(0, NULL); // wlen includes null terminating character
+ if (!wlen) {
+ errno = map_errno(GetLastError());
+ return NULL;
+ }
+
+ wp = malloc(wlen * sizeof(WCHAR));
+ if (!wp) {
+ errno = ENOMEM;
+ return NULL;
+ }
+
+ if (!GetCurrentDirectoryW(wlen, wp)) {
+ errno = map_errno(GetLastError());
+ free(wp);
+ return NULL;
+ }
+
+ p = wstr_to_utf8(wp, &len);
+ free(wp);
+ len += 1; // len now includes null terminating character
+
+ if (!p) {
+ errno = ENOMEM;
+ return NULL;
+ }
+
+ if (buffer) {
+ if (size < len) {
+ free(p);
+ errno = ERANGE;
+ return NULL;
+ }
+
+ memcpy(buffer, p, len);
+ free(p);
+ p = buffer;
+ }
+
+ translate_char(p, '\\', '/');
+
+ return p;
+}
+
int
chown(const char *path, int owner, int group)
{
</code></pre>
<p>This works, in terms of returning a UTF-8 path string; however,<br>
rb_dir_getwd calls rb_enc_associate(cwd, rb_filesystem_encoding())<br>
on the result, associating the WINDOWS-1252 encoding instead of<br>
UTF-8.</p>
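<p>The effect of tagging UTF-8 bytes with the wrong encoding can be reproduced in plain Ruby. In this sketch, force_encoding plays the role of rb_enc_associate() (it relabels the bytes without converting them), and Windows-1252 stands in for the filesystem encoding; the sample string is an arbitrary choice.</p>

```ruby
utf8 = "caf\u00e9"                      # bytes: 63 61 66 C3 A9

# Relabel the same bytes as Windows-1252 (no conversion happens).
mislabeled = utf8.dup.force_encoding("Windows-1252")

puts mislabeled.bytes == utf8.bytes     # true: the bytes are untouched
puts mislabeled.length                  # 5: C3 and A9 now count as two characters
puts mislabeled.encode("UTF-8") == "caf\u00c3\u00a9"   # true: mojibake

# Relabelling back to UTF-8 recovers the original text.
puts mislabeled.dup.force_encoding("UTF-8") == utf8    # true
```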
<p>So, I would like to ask: is there a reason<br>
enc_set_filesystem_encoding() should not return UTF-8 now for<br>
Windows?</p>
<pre><code>static int
enc_set_filesystem_encoding(void)
{
    int idx;
#if defined NO_LOCALE_CHARMAP
    idx = rb_enc_to_index(rb_default_external_encoding());
#elif defined _WIN32 || defined __CYGWIN__
    char cp[sizeof(int) * 8 / 3 + 4];
    snprintf(cp, sizeof cp, "CP%d", AreFileApisANSI() ? GetACP() : GetOEMCP());
    idx = rb_enc_find_index(cp);
    if (idx < 0) idx = rb_ascii8bit_encindex();
#else
    idx = rb_enc_to_index(rb_default_external_encoding());
#endif

    enc_alias_internal("filesystem", idx);
    return idx;
}
</code></pre>
<p>It seems strange that it still selects non-unicode encodings.</p>
<hr>
<p>Also, my bootstraptest encountered one more problem. The mktmpdir<br>
can't delete the unicode directory entries created by my test:</p>
<pre><code>P:/code/ruby-svn/trunk/lib/fileutils.rb:1307:in `unlink': Invalid argument - C:/temp/bootstraptest20100505-1016-1lvss6a.tmpwd/???? (Errno::EINVAL)
    from P:/code/ruby-svn/trunk/lib/fileutils.rb:1307:in `block in remove_file'
    from P:/code/ruby-svn/trunk/lib/fileutils.rb:1315:in `platform_support'
    from P:/code/ruby-svn/trunk/lib/fileutils.rb:1306:in `remove_file'
    from P:/code/ruby-svn/trunk/lib/fileutils.rb:1295:in `remove'
    from P:/code/ruby-svn/trunk/lib/fileutils.rb:761:in `block in remove_entry'
    from P:/code/ruby-svn/trunk/lib/fileutils.rb:1345:in `block (2 levels) in postorder_traverse'
    from P:/code/ruby-svn/trunk/lib/fileutils.rb:1349:in `postorder_traverse'
    from P:/code/ruby-svn/trunk/lib/fileutils.rb:1344:in `block in postorder_traverse'
    from P:/code/ruby-svn/trunk/lib/fileutils.rb:1343:in `each'
    from P:/code/ruby-svn/trunk/lib/fileutils.rb:1343:in `postorder_traverse'
    from P:/code/ruby-svn/trunk/lib/fileutils.rb:759:in `remove_entry'
    from P:/code/ruby-svn/trunk/lib/fileutils.rb:688:in `remove_entry_secure'
    from P:/code/ruby-svn/trunk/lib/tmpdir.rb:85:in `ensure in mktmpdir'
    from P:/code/ruby-svn/trunk/lib/tmpdir.rb:85:in `mktmpdir'
    from ./bootstraptest/runner.rb:375:in `in_temporary_working_directory'
    from ./bootstraptest/runner.rb:126:in `main'
    from ./bootstraptest/runner.rb:398:in `&lt;main&gt;'
</code></pre>
<p>I don't have a patch for this yet. However, it looks like<br>
in win32.c, routines such as rb_w32_opendir and rb_w32_readdir_with_enc<br>
are already using WCHAR internally!</p>
<p>For example:</p>
<pre><code>DIR *
rb_w32_opendir(const char *filename)
{
    struct stati64 sbuf;
    WIN32_FIND_DATAW fd;
    HANDLE fh;
    WCHAR *wpath;

    if (!(wpath = filecp_to_wstr(filename, NULL)))
        return NULL;
</code></pre>
<p>... so it seems if filesystem encoding were considered UTF-8<br>
instead of WINDOWS-1252, then opendir might just work.</p>
<p>Similarly (somewhat) with rb_w32_readdir_with_enc. (At least,<br>
it does call readdir_internal, which uses WCHAR.)</p>
<p>So I <em>think</em> these are very close to working UTF-8, but, again,<br>
I don't understand why enc_set_filesystem_encoding() uses<br>
WINDOWS-1252 still.</p>
<p>Thanks,</p>
<p>Regards,</p>
<p>Bill</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=108262010-05-06T19:58:23Zusa (Usaku NAKAMURA)usa@garbagecollect.jp
<ul></ul><p>=begin<br>
Hello,</p>
<p>In message "<a href="https://blade.ruby-lang.org/ruby-core/30052">[ruby-core:30052]</a> Re: [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>] Some windows unicode path issues remain"<br>
on May.06,2010 19:39:27, <a href="mailto:billk@cts.com" class="email">billk@cts.com</a> wrote:</p>
<blockquote>
<p>This works, in terms of returning a UTF-8 path string; however,<br>
rb_dir_getwd calls rb_enc_associate(cwd, rb_filesystem_encoding())<br>
on the result, associating the WINDOWS-1252 encoding instead of<br>
UTF-8.</p>
<p>So, I would like to ask: is there a reason<br>
enc_set_filesystem_encoding() should not return UTF-8 now for<br>
Windows?</p>
</blockquote>
<p>For compatibility.</p>
<p>I will not change the filesystem encoding on Windows in the 1.9 series.<br>
In all methods which return filenames, the default encoding<br>
of the returned value must be the filesystem encoding.<br>
So, if someone wants to get a filename in another encoding, he/she<br>
should specify the encoding in some way.<br>
Of course, it is necessary to decide the "some way" for each<br>
method.</p>
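<p>One possible shape of that "some way", sketched on the caller's side in Ruby: take the name as returned (tagged with the filesystem encoding) and transcode it explicitly. The helper filename_as is hypothetical, not an existing API, and Windows-1252 is assumed as the filesystem encoding for the sample. Such a conversion can only recover characters that the code page could represent in the first place.</p>

```ruby
# Hypothetical helper: transcode a returned filename into the
# encoding the caller asks for (UTF-8 by default).
def filename_as(name, encoding = Encoding::UTF_8)
  name.encode(encoding)
end

# Simulate a name returned with the filesystem encoding (assumed CP1252).
returned = "caf\xE9".b.force_encoding("Windows-1252")

converted = filename_as(returned)
puts converted.encoding                 # UTF-8
puts converted == "caf\u00e9"           # true
puts Encoding.find("filesystem")        # whatever this platform reports
```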
<blockquote>
<p>Also, my bootstraptest encountered one more problem. The mktmpdir<br>
can't delete the unicode directory entries created by my test:</p>
</blockquote>
<p>Yes, I know it.<br>
This is the problem of globbing.<br>
I've already decided to solve this problem in 1.9.3 or later.</p>
<p>Regards,</p>
<p>U.Nakamura <a href="mailto:usa@garbagecollect.jp" class="email">usa@garbagecollect.jp</a></p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=108292010-05-06T21:38:44Zspatulasnout (B Kelly)billk@cts.com
<ul></ul><p>=begin<br>
Hi,</p>
<p>U.Nakamura wrote:</p>
<blockquote>
<p>In message "<a href="https://blade.ruby-lang.org/ruby-core/30052">[ruby-core:30052]</a> Re: [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>] Some windows unicode path issues remain"<br>
on May.06,2010 19:39:27, <a href="mailto:billk@cts.com" class="email">billk@cts.com</a> wrote:</p>
<blockquote>
<p>This works, in terms of returning a UTF-8 path string; however,<br>
rb_dir_getwd calls rb_enc_associate(cwd, rb_filesystem_encoding())<br>
on the result, associating the WINDOWS-1252 encoding instead of<br>
UTF-8.</p>
<p>So, I would like to ask: is there a reason<br>
enc_set_filesystem_encoding() should not return UTF-8 now for<br>
Windows?</p>
</blockquote>
<p>For compatibility.</p>
<p>I will not change the filesystem encoding on Windows in the 1.9 series.<br>
In all methods which return filenames, the default encoding<br>
of the returned value must be the filesystem encoding.<br>
So, if someone wants to get a filename in another encoding, he/she<br>
should specify the encoding in some way.<br>
Of course, it is necessary to decide the "some way" for each<br>
method.</p>
</blockquote>
<p>Ah.</p>
<p>So my rb_w32_ugetcwd patch is not very useful, at present,<br>
since there is no "some way" to specify the encoding via<br>
Dir.pwd.</p>
<p>May I suggest a new command line flag for this purpose:</p>
<p>ruby --DEAR_GOD_WORK_WITH_UTF_8_DAMN_IT</p>
<p>;)</p>
<p>Well then, this becomes a philosophical question at this point,<br>
but in an effort to better understand, I am wondering:</p>
<p>How does it break compatibility, if we allow filesystem encoding<br>
to become UTF-8 when rb_default_external_encoding is UTF-8?</p>
<p>Do we have evidence that anyone has written scripts that will<br>
break in such a case? (And if so, can we agree to summon the<br>
fleas of a thousand camels to infest their undergarments?)</p>
<blockquote>
<blockquote>
<p>Also, my bootstraptest encountered one more problem. The mktmpdir<br>
can't delete the unicode directory entries created by my test:</p>
</blockquote>
<p>Yes, I know it.<br>
This is the problem of globbing.<br>
I've already decided to solve this problem in 1.9.3 or later.</p>
</blockquote>
<p>OK.</p>
<p>I admit I don't understand why it's considered a globbing problem.<br>
Does the UTF-8 support somehow make the globbing more difficult?<br>
I thought it was just the same situation as above: a filesystem<br>
encoding problem?</p>
<p>Regards,</p>
<p>Bill</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=108322010-05-06T22:27:15Zusa (Usaku NAKAMURA)usa@garbagecollect.jp
<ul></ul><p>=begin<br>
Hello,</p>
<p>In message "<a href="https://blade.ruby-lang.org/ruby-core/30054">[ruby-core:30054]</a> Re: [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>] Some windows unicode path issues remain"<br>
on May.06,2010 21:38:19, <a href="mailto:billk@cts.com" class="email">billk@cts.com</a> wrote:</p>
<blockquote>
<blockquote>
<p>For compatibility.</p>
<p>I will not change the filesystem encoding on Windows in the 1.9 series.<br>
In all methods which return filenames, the default encoding<br>
of the returned value must be the filesystem encoding.<br>
So, if someone wants to get a filename in another encoding, he/she<br>
should specify the encoding in some way.<br>
Of course, it is necessary to decide the "some way" for each<br>
method.</p>
</blockquote>
<p>Ah.</p>
<p>So my rb_w32_ugetcwd patch is not very useful, at present,<br>
since there is no "some way" to specify the encoding via<br>
Dir.pwd.</p>
</blockquote>
<p>Unfortunately...</p>
<blockquote>
<p>Well then, this becomes a philosophical question at this point,<br>
but in an effort to better understand, I am wondering:</p>
<p>How does it break compatibility, if we allow filesystem encoding<br>
to become UTF-8 when rb_default_external_encoding is UTF-8?</p>
</blockquote>
<p>You should advocate using default_internal instead of<br>
default_external :)<br>
It's acceptable for me.</p>
<blockquote>
<p>I admit I don't understand why it's considered a globbing problem.</p>
</blockquote>
<p>FileUtils uses Dir.entries to get filenames to remove.</p>
<blockquote>
<p>Does the UTF-8 support somehow make the globbing more difficult?<br>
I thought it was just the same situation as above: a filesystem<br>
encoding problem?</p>
</blockquote>
<p>Yes, you are right.</p>
<p>Regards,</p>
<p>U.Nakamura <a href="mailto:usa@garbagecollect.jp" class="email">usa@garbagecollect.jp</a></p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=108442010-05-07T06:11:15Zspatulasnout (B Kelly)billk@cts.com
<ul></ul><p>=begin<br>
Hi,</p>
<p>U.Nakamura wrote:</p>
<blockquote>
<blockquote>
<p>How does it break compatibility, if we allow filesystem encoding<br>
to become UTF-8 when rb_default_external_encoding is UTF-8?</p>
</blockquote>
<p>You should advocate using default_internal instead of<br>
default_external :)<br>
It's acceptable for me.</p>
</blockquote>
<p>Ah, thanks, default_internal does make more sense. :)</p>
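<p>As a rough sketch of the rule being discussed (not an existing Ruby API): the helper below, <code>proposed_path_encoding</code>, is a hypothetical name, and it simply models "use UTF-8 for paths when default_internal is UTF-8, otherwise keep the filesystem encoding".</p>

```ruby
# Hypothetical helper modelling the proposal from this thread.
# It is NOT part of Ruby; both parameters default to the live settings
# but can be passed explicitly to explore other configurations.
def proposed_path_encoding(internal = Encoding.default_internal,
                           filesystem = Encoding.find("filesystem"))
  # Only an explicit UTF-8 default_internal opts in to UTF-8 paths;
  # every other configuration keeps today's behaviour.
  internal == Encoding::UTF_8 ? Encoding::UTF_8 : filesystem
end

# default_internal unset: paths stay in the filesystem encoding.
puts proposed_path_encoding(nil, Encoding::Windows_1252)

# default_internal set to UTF-8: paths would come back as UTF-8.
puts proposed_path_encoding(Encoding::UTF_8, Encoding::Windows_1252)
```

<p>Under this model, users who never set default_internal would see no change, which is the compatibility argument made in this exchange.</p>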
<p>Regarding advocacy: apart from yourself, who are the people<br>
who need to comment on this? Is this a question for Matz?<br>
Or... ?</p>
<p>Thanks,</p>
<p>Bill</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=108512010-05-07T11:56:03Zusa (Usaku NAKAMURA)usa@garbagecollect.jp
<ul></ul><p>=begin<br>
Hello,</p>
<p>In message "<a href="https://blade.ruby-lang.org/ruby-core/30071">[ruby-core:30071]</a> Re: [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>] Some windows unicode path issues remain"<br>
on May.07,2010 06:11:00, <a href="mailto:billk@cts.com" class="email">billk@cts.com</a> wrote:</p>
<blockquote>
<p>Regarding advocacy: apart from yourself, who are the people<br>
who need to comment on this? Is this a question for Matz?<br>
Or... ?</p>
</blockquote>
<p>I assume,</p>
<ol>
<li>matz: the grand designer of Ruby</li>
<li>naruse: an authority on Ruby M17N</li>
<li>me: main maintainer of Ruby on Windows</li>
<li>all users of Ruby, of course, especially people using<br>
non-unicode (multibyte) environments</li>
</ol>
<p>Regards,</p>
<p>U.Nakamura <a href="mailto:usa@garbagecollect.jp" class="email">usa@garbagecollect.jp</a></p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=108522010-05-07T13:12:43Zspatulasnout (B Kelly)billk@cts.com
<ul></ul><p>=begin<br>
Hi,</p>
<p>U.Nakamura wrote:</p>
<blockquote>
<p>I assume,</p>
<ol>
<li>matz: the grand designer of Ruby</li>
<li>naruse: an authority on Ruby M17N</li>
<li>me: main maintainer of Ruby on Windows</li>
<li>all users of Ruby, of course, especially people using<br>
non-unicode (multibyte) environments</li>
</ol>
</blockquote>
<p>For #1 (sorry, I can't resist :)</p>
<p><a href="http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/20110" class="external">http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/20110</a></p>
<p>For #4, wouldn't we expect people using a non-unicode environment<br>
to <em>not</em> set their default_internal encoding to UTF-8 ? So I<br>
would think they would not be affected.</p>
<hr>
<p>As an aside, from the point of view of writing Ruby software to<br>
be installed on arbitrary users' machines, I don't think there's<br>
any such thing as a non-unicode environment anymore.</p>
<p>The minute any of my users (even if they are English speaking)<br>
downloads a file from their web browser called<br>
私のホバークラフトは鰻でいっぱいです.mpg<br>
my application, which uses Dir.entries to locate and catalog<br>
media files, is now broken on their system.</p>
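<p>A small Ruby sketch of why such a filename is a problem, assuming a Windows-1252 filesystem encoding as mentioned earlier in this thread. The lossy conversion at the end is only an approximation of what a legacy code page does to the name, not a reproduction of the Windows API.</p>

```ruby
# encoding: UTF-8
name = "私のホバークラフトは鰻でいっぱいです.mpg"

# A strict conversion fails: none of the Japanese characters exist
# in Windows-1252.
begin
  name.encode("Windows-1252")
  representable = true
rescue Encoding::UndefinedConversionError
  representable = false
end
puts "representable in Windows-1252: #{representable}"

# A lossy conversion replaces every unmappable character with "?",
# so the result no longer identifies the file on disk.
lossy = name.encode("Windows-1252", undef: :replace, replace: "?")
puts lossy
```

<p>Once the name has been flattened like this, there is no way to get back to the real file, which is why returning names in a Unicode encoding matters here.</p>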
<p>(Of course, since I distribute the Ruby interpreter with my<br>
application, I have the luxury of working around the problem,<br>
by installing a non-standard Ruby. But I still believe it's<br>
important for standard Ruby to have full Unicode support on<br>
Windows.)</p>
<p>Regards,</p>
<p>Bill</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=110802010-05-18T19:31:03Zspatulasnout (B Kelly)billk@cts.com
<ul></ul><p>=begin<br>
Hi,</p>
<p>Bill Kelly wrote:</p>
<blockquote>
<blockquote>
<ol>
<li>matz: the grand designer of Ruby</li>
<li>naruse: an authority on Ruby M17N</li>
<li>me: main maintainer of Ruby on Windows</li>
<li>all users of Ruby, of course, especially people using<br>
non-unicode (multibyte) environments</li>
</ol>
</blockquote>
<p>For #4, wouldn't we expect people using a non-unicode environment<br>
to <em>not</em> set their default_internal encoding to UTF-8 ? So I<br>
would think they would not be affected.</p>
</blockquote>
<p>Noticed an interesting ChangeLog entry from yesterday on<br>
ruby_1_9_2 branch:</p>
<p>Mon May 17 11:09:58 2010 NAKAMURA Usaku <a href="mailto:usa@ruby-lang.org" class="email">usa@ruby-lang.org</a></p>
<pre><code> merge from trunk (r27856, r27857)
* lib/fileutils.rb (FileUtils::Entry_#entries): returns pathname in
UTF-8 on Windows to allow FileUtils accessing all pathnames
internally.
</code></pre>
<pre><code>Index: lib/fileutils.rb
===================================================================
--- lib/fileutils.rb (revision 27657)
+++ lib/fileutils.rb (working copy)
@@ -1176,7 +1176,9 @@
     end

     def entries
-      Dir.entries(path())\
+      opts = {}
+      opts[:encoding] = "UTF-8" if /mswin|mignw/ =~ RUBY_PLATFORM
+      Dir.entries(path(), opts)\
           .reject {|n| n == '.' or n == '..' }\
           .map {|n| Entry_.new(prefix(), join(rel(), n.untaint)) }
     end
===================================================================
</code></pre>
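<p>For reference, a minimal sketch of the <code>Dir.entries</code> encoding option that the change above relies on. It uses a throwaway temporary directory, and the filename is borrowed from the test script attached to this issue. The option is written as a keyword argument here so the sketch also runs on current Ruby versions.</p>

```ruby
require "tmpdir"

# Filename taken from the test script in this issue.
fname = "\u52ec\u52ee\u52f1\u52f2.txt"

names = Dir.mktmpdir("ruby_unicode_test") do |dir|
  File.write(File.join(dir, fname), "Hello, World\n")

  # Ask for entry names tagged as UTF-8 instead of the filesystem encoding.
  Dir.entries(dir, encoding: "UTF-8").reject { |n| n == "." || n == ".." }
end

names.each { |n| puts "#{n.encoding}: #{n.inspect}" }
```

<p>Because the caller names the encoding explicitly, the default for everyone else stays the filesystem encoding, which fits the compatibility constraint stated earlier in the thread.</p>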
<p>Would this approach also be considered for Dir.pwd:</p>
<p>result = Dir.pwd(:encoding => "UTF-8")</p>
<p>?</p>
<p>If so, I already have the rb_w32_ugetcwd implementation (presented<br>
in <a href="https://blade.ruby-lang.org/ruby-core/30052">[ruby-core:30052]</a> ).</p>
<p>I would be happy to provide a patch for Dir.pwd if this is<br>
acceptable.</p>
<p>Regards,</p>
<p>Bill</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=110882010-05-18T22:07:41Zusa (Usaku NAKAMURA)usa@garbagecollect.jp
<ul></ul><p>=begin<br>
Hello,</p>
<p>In message "<a href="https://blade.ruby-lang.org/ruby-core/30296">[ruby-core:30296]</a> Re: [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>] Some windows unicode path issues remain"<br>
on May.18,2010 19:30:53, <a href="mailto:billk@cts.com" class="email">billk@cts.com</a> wrote:</p>
<blockquote>
<p>Noticed an interesting ChangeLog entry from yesterday on<br>
ruby_1_9_2 branch:</p>
<p>Mon May 17 11:09:58 2010 NAKAMURA Usaku <a href="mailto:usa@ruby-lang.org" class="email">usa@ruby-lang.org</a></p>
<pre><code> merge from trunk (r27856, r27857)
* lib/fileutils.rb (FileUtils::Entry_#entries): returns pathname in
UTF-8 on Windows to allow FileUtils accessing all pathnames
internally.
</code></pre>
</blockquote>
<p>In this case, Dir.entries already has its own "some way".<br>
So I can use it.</p>
<blockquote>
<p>Would this approach also be considered for Dir.pwd:</p>
<p>result = Dir.pwd(:encoding => "UTF-8")</p>
<p>?</p>
</blockquote>
<p>This might be a moot point.<br>
For instance, there might be an insistence that Dir.pwd<br>
should accept only the encoding because there is no possibility<br>
that it takes other arguments.</p>
<p>Regards,</p>
<p>U.Nakamura <a href="mailto:usa@garbagecollect.jp" class="email">usa@garbagecollect.jp</a></p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=110982010-05-19T16:49:05Zspatulasnout (B Kelly)billk@cts.com
<ul></ul><p>=begin<br>
Hi,</p>
<p>U.Nakamura wrote:</p>
<blockquote>
<p>In message "<a href="https://blade.ruby-lang.org/ruby-core/30296">[ruby-core:30296]</a> Re: [Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>] Some windows unicode path issues remain"<br>
on May.18,2010 19:30:53, <a href="mailto:billk@cts.com" class="email">billk@cts.com</a> wrote:</p>
<blockquote>
<p>Would this approach also be considered for Dir.pwd:</p>
<p>result = Dir.pwd(:encoding => "UTF-8")</p>
<p>?</p>
</blockquote>
<p>This might be a moot point.<br>
For instance, there might be an insistence that Dir.pwd<br>
should accept only the encoding because there is no possibility<br>
that it takes other arguments.</p>
</blockquote>
<p>Any solution would be fine with me. :)</p>
<p>Thanks to your finding a solution for Dir.entries, it seems<br>
we are approaching nearly 100% unicode path capability for<br>
win32!</p>
<p>Do you anticipate it will be difficult to reach a decision<br>
regarding:</p>
<p>result = Dir.pwd(:encoding => "UTF-8")<br>
vs.<br>
result = Dir.pwd("UTF-8")<br>
vs.<br>
(some other way)</p>
<p>?</p>
<p>Thanks,</p>
<p>Regards,</p>
<p>Bill</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=114262010-06-03T10:29:54Zusa (Usaku NAKAMURA)usa@garbagecollect.jp
<ul><li><strong>Category</strong> changed from <i>core</i> to <i>M17N</i></li><li><strong>Priority</strong> changed from <i>5</i> to <i>Normal</i></li><li><strong>Target version</strong> changed from <i>1.9.2</i> to <i>2.0.0</i></li></ul><p>=begin</p>
<p>=end</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=345592012-12-09T21:40:37Zmame (Yusuke Endoh)mame@ruby-lang.org
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/34559/diff?detail_id=24315">diff</a>)</li></ul><p>Usa-san, what's the status?</p>
<p>--<br>
Yusuke Endoh <a href="mailto:mame@tsg.ne.jp" class="email">mame@tsg.ne.jp</a></p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=365012013-02-18T21:07:26Zmame (Yusuke Endoh)mame@ruby-lang.org
<ul><li><strong>Target version</strong> changed from <i>2.0.0</i> to <i>2.6</i></li></ul><p>Usa-san, what's the status?</p>
<p>--<br>
Yusuke Endoh <a href="mailto:mame@tsg.ne.jp" class="email">mame@tsg.ne.jp</a></p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=460612014-04-03T14:07:17Zthomthom (Thomas Thomassen)thomas@thomthom.net
<ul></ul><p>B Kelly wrote:</p>
<blockquote>
<p>=begin<br>
Thanks to your finding a solution for Dir.entries, it seems<br>
we are approaching nearly 100% unicode path capability for<br>
win32!<br>
=end</p>
</blockquote>
<p>In Ruby 2.0 there still appear to be several issues with Ruby and Unicode characters in filenames. Dir.entries fails, and load and require fail. <code>__FILE__</code> has the wrong encoding. I see some things slated for Ruby 2.2, but not everything.</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=460702014-04-04T10:38:26Zduerst (Martin Dürst)duerst@it.aoyama.ac.jp
<ul></ul><p>On 2014/04/03 23:07, <a href="mailto:thomas@thomthom.net" class="email">thomas@thomthom.net</a> wrote:</p>
<blockquote>
<p>Issue <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a> has been updated by Thomas Thomassen.</p>
</blockquote>
<blockquote>
<p>In Ruby 2.0 there still appear to be several issues with Ruby and Unicode characters in filenames. Dir.entries fails, and load and require fail. <code>__FILE__</code> has the wrong encoding. I see some things slated for Ruby 2.2, but not everything.</p>
</blockquote>
<p>If you know of anything that's not yet in Ruby 2.2, please tell us, best<br>
by opening a bug for each issue.</p>
<p>Regards, Martin.</p>
<blockquote>
<hr>
<p>Bug <a class="issue tracker-1 status-5 priority-4 priority-default closed" title="Bug: Some windows unicode path issues remain (Closed)" href="https://bugs.ruby-lang.org/issues/1685">#1685</a>: Some windows unicode path issues remain<br>
<a href="https://bugs.ruby-lang.org/issues/1685#change-46061" class="external">https://bugs.ruby-lang.org/issues/1685#change-46061</a></p>
<ul>
<li>Author: B Kelly</li>
<li>Status: Assigned</li>
<li>Priority: Normal</li>
<li>Assignee: Usaku NAKAMURA</li>
<li>Category: M17N</li>
<li>Target version: next minor</li>
<li>ruby -v: ruby 1.9.2dev (2009-06-24) [i386-mswin32_71]</li>
<li>Backport:</li>
</ul>
<hr>
<p>=begin<br>
Hi,</p>
<p>I see some nice progress has been made in unicode path<br>
handling on windows.</p>
<p>The following tests are not exhaustive, but do reveal some<br>
remaining issues.</p>
<p>Everything below "NOT WORKING" fails in one way or another.</p>
<p>Regards,</p>
<p>Bill</p>
<pre><code># encoding: UTF-8
# Test unicode path/dir handling on windows
require 'test/unit'

class TestUnicodeFilenamesAndPaths < Test::Unit::TestCase
  def setup
    tmpdir = ENV['TEMP'] || "C:/TEMP"
    Dir.chdir tmpdir
    puts Dir.pwd
    testdir = "ruby_unicode_test"
    Dir.mkdir testdir unless test ?d, testdir
    Dir.chdir testdir
    puts Dir.pwd
  end

  def test_unicode_paths
    fname_resume = "R\xC3\xA9sum\xC3\xA9".force_encoding("UTF-8")
    fname_chinese = "\u52ec\u52ee\u52f1\u52f2.txt"
    dname_chinese = "\u52ec\u52ee\u52f1\u52f2"
    assert_equal( "UTF-8", fname_resume.encoding.name )
    File.open(fname_resume, "w") {|io| io.puts "Hello, World"}
    assert_equal( "UTF-8", fname_chinese.encoding.name )
    File.open(fname_chinese, "w") {|io| io.puts "Hello, World"}
    dat = File.read(fname_chinese)
    assert_equal( "Hello, World\n", dat )
    files = Dir["*"]
    assert( files.include? fname_resume )
    assert( files.include? fname_chinese )

    # NOT WORKING:
    Dir.rmdir dname_chinese rescue nil
    Dir.mkdir dname_chinese
    test ?d, dname_chinese
    Dir.chdir dname_chinese
    cwd = Dir.pwd
    assert( cwd[(-dname_chinese.length)..-1] == dname_chinese )
    Dir.chdir ".."
    x = File.stat(fname_resume)
    x = File.stat(fname_chinese)
    x = File.stat(dname_chinese)
    assert( File.exist? fname_resume )
    assert( File.exist? fname_chinese )
    assert( test(?f, fname_resume) )
    assert( test(?f, fname_chinese) )
    files = Dir[fname_resume]
    assert_equal( fname_resume, files.first )
    files = Dir[fname_chinese]
    assert_equal( fname_chinese, files.first )
    files = Dir[dname_chinese]
    assert_equal( dname_chinese, files.first )
  end
end
=end
---Files--------------------------------
spatulasnout-unicode-mkdir-diffs.txt (3.56 KB)
test_io_unicode_paths.rb (925 Bytes)
</code></pre>
</blockquote> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=460722014-04-04T11:45:51Zthomthom (Thomas Thomassen)thomas@thomthom.net
<ul></ul><p>Martin Dürst wrote:</p>
<blockquote>
<p>If you know of anything that's not yet in Ruby 2.2, please tell us, best<br>
by opening a bug for each issue.</p>
</blockquote>
<p>I've been setting up tests and running them through Ruby 2.2. I find that some are fixed, but there are still several issues related to file handling. We'll be filing issues for what we have uncovered.</p> Ruby master - Bug #1685: Some windows unicode path issues remainhttps://bugs.ruby-lang.org/issues/1685?journal_id=461052014-04-07T18:50:53Zusa (Usaku NAKAMURA)usa@garbagecollect.jp
<ul><li><strong>Status</strong> changed from <i>Assigned</i> to <i>Closed</i></li></ul><p>This ticket is too old and covers too many different problems.<br>
Thomas is now investigating many things and filing new tickets. (Thank you!)<br>
Please refer to them from now on.</p>