Bug #8676

ruby 2.0 can not require or load the source file with non-ascii path name

Added by 贾 延平 over 1 year ago. Updated 9 months ago.

[ruby-core:56136]
Status:Closed
Priority:Normal
Assignee:-
ruby -v:ruby 2.0.0p277 (2013-07-23 revision 42121) [i386-mingw32]
Backport:1.9.3: UNKNOWN, 2.0.0: REQUIRED

Description

=begin
Sorry for my poor English :)

I attached a patch to fix the problem, but I don't know whether it is the right approach.

Changelog:
* include/ruby/intern.h: change the declaration of "rb_load_file"; the parameter changes from char* to VALUE
* load.c: change the caller
* ruby.c: change the implementation of "rb_load_file"
* win32/file.c: change the Win32 API calls from the ANSI variants to the Unicode variants
=end
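
For readers who want to check a build, the failure in the subject line could be reproduced with something like the following. This is only a sketch and not part of the original report: the directory name, the file name, and the module name are invented for illustration.

```ruby
require "tmpdir"

# Hypothetical reproduction sketch; the directory, file, and module names
# are invented for illustration and do not come from the report.
def load_from_non_ascii_dir
  Dir.mktmpdir do |root|
    dir = File.join(root, "测试")            # non-ASCII path component
    Dir.mkdir(dir)
    path = File.join(dir, "hello.rb")
    File.write(path, "module NonAsciiPathDemo; end\n")
    load path                                # raises LoadError if the path cannot be opened
  end
end
```

On a build without the problem, `load_from_non_ascii_dir` should return true; on an affected Windows build the `load` call would be expected to raise instead.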

win32_file_open_patch.patch - patch file (2.3 KB) 贾 延平, 07/24/2013 11:01 AM

win32_file_open_patch.patch (4.96 KB) 贾 延平, 07/26/2013 09:37 AM


Related issues

Related to Ruby trunk - Bug #9699: Cannot require .so file on Windows if the file path is un... Closed 04/03/2014

Associated revisions

Revision 42183
Added by Nobuyoshi Nakada over 1 year ago

load.c: search in OS path encoding

  • load.c (rb_load_internal): use rb_load_file_str() to keep path encoding.
  • load.c (rb_require_safe): search in OS path encoding for Windows.
  • ruby.c (rb_load_file_str): load file with keeping path encoding.
  • win32/file.c (rb_file_load_ok): use WCHAR type API assuming incoming path is encoded in UTF-8. [Bug #8676]
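
The last item relies on turning a UTF-8 path into the wide-character form that the WCHAR variants of the Windows file APIs take. The Ruby sketch below only illustrates that conversion at the string level; it is not the C code from r42183, and the helper names `to_wide_path` and `from_wide_path` are made up for this example.

```ruby
# Illustration only: hypothetical helpers, not part of Ruby or of r42183.

# Convert a UTF-8 path into a NUL-terminated UTF-16LE string, which is the
# byte layout a WCHAR-based Windows API expects.
def to_wide_path(utf8_path)
  (utf8_path + "\0").encode(Encoding::UTF_16LE)
end

# Convert back, dropping the trailing NUL terminator.
def from_wide_path(wide)
  wide.encode(Encoding::UTF_8).chomp("\0")
end

path = "C:/项目/lib.rb"
wide = to_wide_path(path)
puts wide.encoding                  # the string is now tagged UTF-16LE
puts from_wide_path(wide) == path   # the round trip preserves the path
```

The conversion is only correct if the incoming bytes really are UTF-8, which is the assumption the commit message states.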

History

#1 Updated by Charlie Somerville over 1 year ago

This patch looks good to me for trunk. It can't be backported though because that would break API compatibility.

#2 Updated by Nobuyoshi Nakada over 1 year ago

  • Subject changed from ruby 2.0 can not require or load the source file with utf-8 encoding and non-asii chars to ruby 2.0 can not require or load the source file with non-ascii path name
  • Description updated (diff)
  • Category changed from platform/mingw to platform/windows

#3 Updated by 贾 延平 over 1 year ago

By the way, the file nacl/pepper_main.c has a function with the same name at line 824.
Does that file need to change too?

#4 Updated by 贾 延平 over 1 year ago

Updated the patch.
It now uses the encoding from the path name.

#5 Updated by Nobuyoshi Nakada over 1 year ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r42183.
贾, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


#6 Updated by 贾 延平 over 1 year ago

The change introduces a new bug: the encoding of __FILE__ is wrong.
In ruby.c, the function load_file_internal converts the VALUE to a char*, which loses the encoding information:
(({const char *orig_fname = StringValueCStr(argp->fname);}))
It then calls rb_parser_compile_file with that char* variable:
(({tree = rb_parser_compile_file(parser, orig_fname, f, line_start);}))
In parse.y, the function rb_parser_compile_file converts the char* back to a VALUE using the filesystem encoding:
(({return rb_parser_compile_file_path(vparser, rb_filesystem_str_new_cstr(f), file, start);}))
But the encoding of the original parameter, argp->fname, is not the filesystem encoding.

What is the principle behind Ruby's encoding handling? Why are there so many VALUE-to-char* conversions? Does a char* have a well-defined encoding?

I think we should convert the encoding at the boundary, and everything inside should use the internal encoding. Am I right?
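
The effect described in this comment can be imitated at the Ruby level. The sketch below is an analogy, not the C code path itself: `through_char_pointer` and `at_boundary` are hypothetical helpers, and Windows-1252 is only an assumed stand-in for a non-UTF-8 filesystem encoding.

```ruby
# Assumption: a stand-in for the filesystem encoding on a non-UTF-8 system.
FILESYSTEM_ENCODING = Encoding::Windows_1252

# Imitates VALUE -> char* -> VALUE: the bytes survive, the encoding tag is
# dropped, and a different tag is attached afterwards without transcoding.
def through_char_pointer(str, assumed_encoding)
  raw = str.b                           # bytes only, like a char*
  raw.force_encoding(assumed_encoding)  # re-tag, bytes unchanged
end

# Imitates converting once at the boundary: the bytes are transcoded so
# that they match the encoding tag.
def at_boundary(str, target_encoding)
  str.encode(target_encoding)
end

original   = "café.rb"                  # UTF-8 source string
mislabeled = through_char_pointer(original, FILESYSTEM_ENCODING)
converted  = at_boundary(original, FILESYSTEM_ENCODING)

puts mislabeled.encode("UTF-8")         # prints "cafÃ©.rb"
puts converted.encode("UTF-8")          # prints "café.rb"
```

In the first case the same bytes are read under the wrong encoding, which is the kind of mismatch reported for __FILE__; in the second case the string is transcoded, so the text survives.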

#7 Updated by Thomas Thomassen over 1 year ago

This patch makes the code explicitly call the *W versions of the file functions. Wouldn't it be better to define the UNICODE compile flag? http://msdn.microsoft.com/en-us/library/cc194801.aspx

CreateFile and the like would then resolve to CreateFileW instead of CreateFileA.

#8 Updated by Usaku NAKAMURA 9 months ago

  • Related to Bug #9699: Cannot require .so file on Windows if the file path is unicode (Includes patch) added

#9 Updated by Usaku NAKAMURA 9 months ago

  • Backport changed from 1.9.3: UNKNOWN, 2.0.0: UNKNOWN to 1.9.3: UNKNOWN, 2.0.0: REQUIRED
