Bug #13571
openScript arguments, encoding, windows / MinGW
Description
The following is windows/MinGW specific.
I have been patching around a failure in TestRubyOptions#test_command_line_progname_nonascii for a while, and decided to investigate further, assuming it was a simple issue.
The following code -
# set dir to valid directory
dir = "E:/temp"
fn_1 = 'äÖü.rb'
fn_2 = 'テスト.rb'
puts
puts "ENV['RUBYOPT'] #{ENV['RUBYOPT']}" \
"\nfilesystem #{Encoding.find('filesystem')}" \
"\nexternal #{Encoding.default_external}" \
"\n`chcp` #{`chcp`}" \
"locale #{Encoding.find('locale')}" \
"\n__ENCODING__ #{__ENCODING__}" \
"\ninternal #{Encoding.default_internal}\n" \
"\nfn_1 #{fn_1.encoding.to_s}" \
"\näÖü.rb #{'äÖü.rb'.encoding.to_s}" \
"\nfn_2 #{fn_2.encoding.to_s}" \
"\nテスト.rb #{'テスト.rb'.encoding.to_s}"
Dir.chdir(dir) do |dir|
  open(fn_1, "w") { |f| f.puts "puts File.basename($0)" }
  open(fn_2, "w") { |f| f.puts "puts File.basename($0)" }
  puts
  puts "File.exist?(fn_1) #{File.exist?(fn_1)}"
  puts "File.exist?(fn_2) #{File.exist?(fn_2)}"
  puts
  puts `ruby #{fn_1}`
  puts `ruby #{fn_2}`
end
produces the following output -
ENV['RUBYOPT']
filesystem Windows-1252
external IBM437
`chcp` Active code page: 437
locale IBM437
__ENCODING__ UTF-8
internal
fn_1 UTF-8
äÖü.rb UTF-8
fn_2 UTF-8
テスト.rb UTF-8
File.exist?(fn_1) true
File.exist?(fn_2) true
äÖü.rb: No such file or directory @ realpath_rec - E:/temp/„™�.rb (Errno::ENOENT)
ruby: Invalid argument -- ???.rb (LoadError)
Of note is that both files are created (and appear correctly in Explorer), but neither can be used as a script argument.
Copying and pasting the names from Explorer onto the command line produces similar output -
E:\temp>ruby äÖü.rb
äÖü.rb: No such file or directory @ realpath_rec - E:/temp/„™�.rb (Errno::ENOENT)
E:\temp>ruby テスト.rb
ruby: Invalid argument -- ???.rb (LoadError)
Lastly, if I run chcp 1252 before running the code, so that 'locale' matches 'filesystem', the first script file runs and produces the correct output.
Hence, it appears that some (or all?) File methods deal properly with filesystem encoding, but ruby script arguments are not correctly encoded.
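The garbled name in the first error is consistent with the argument arriving as IBM437 (locale) bytes and then being read as Windows-1252 (filesystem). The sketch below only imitates that with String#encode and is a guess at the mechanism, not a trace of what ruby does internally with its arguments. In IBM437 the name äÖü is the bytes 84 99 81, and in Windows-1252 bytes 84 and 99 are „ and ™ while 81 is unassigned. The second name has no IBM437 representation at all, which would fit the ???.rb in the second error.

```ruby
# Illustration only: imitates the suspected code page mix-up with
# String#encode. It does not call into ruby's argument handling.
name = 'äÖü.rb'

# Bytes the name would have under the OEM code page (chcp 437).
oem = name.encode('IBM437')
puts oem.bytes.map { |b| format('%02X', b) }.join(' ')

# The same bytes read as if they were in the filesystem (ANSI) code page.
# Bytes with no Windows-1252 mapping are replaced instead of raising.
misread = oem.dup.force_encoding('Windows-1252')
             .encode('UTF-8', invalid: :replace, undef: :replace)
puts misread

# With chcp 1252 both sides use the same code page, so the name
# survives the round trip unchanged.
round_trip = name.encode('Windows-1252').encode('UTF-8')
puts round_trip == name

# Katakana cannot be represented in IBM437; with undef: :replace each
# character is substituted during the conversion.
lossy = 'テスト.rb'.encode('IBM437', undef: :replace)
puts lossy
```

This also matches the chcp 1252 observation: once locale and filesystem agree, the first name converts cleanly, while the second would still fail because Windows-1252 cannot represent it either.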
Updated by MSP-Greg (Greg L) over 5 years ago
After seeing the CI test issue with 60743, I revisited this with a prior build (ruby 2.5.0dev (2017-11-11 trunk 60742) [x64-mingw32]).
The issue has been resolved, and using the above code, ruby loads both files. I believe @nobu (Nobuyoshi Nakada) has been responsible for most of the encoding-related commits, so thanks again for your work.
Okay to close.