Bug #13571

Script arguments, encoding, windows / MinGW

Added by MSP-Greg (Greg L) almost 7 years ago. Updated 7 months ago.

Status: Closed
Assignee: -
Target version: -
ruby -v: ruby 2.5.0dev (2017-05-17 trunk 58774) [x64-mingw32]
[ruby-core:81218]

Description

The following is windows/MinGW specific.

I have been patching around a failure in TestRubyOptions#test_command_line_progname_nonascii for a while, and decided to investigate further, assuming it was a simple issue.

The following code -

# set dir to valid directory
dir = "E:/temp"
fn_1 = 'äÖü.rb'
fn_2 = 'テスト.rb'

puts
puts "ENV['RUBYOPT'] #{ENV['RUBYOPT']}" \
   "\nfilesystem     #{Encoding.find('filesystem')}" \
   "\nexternal       #{Encoding.default_external}" \
   "\n`chcp`         #{`chcp`}" \
     "locale         #{Encoding.find('locale')}" \
   "\n__ENCODING__   #{__ENCODING__}" \
   "\ninternal       #{Encoding.default_internal}\n" \
   "\nfn_1           #{fn_1.encoding.to_s}" \
   "\näÖü.rb         #{'äÖü.rb'.encoding.to_s}" \
   "\nfn_2           #{fn_2.encoding.to_s}" \
   "\nテスト.rb       #{'テスト.rb'.encoding.to_s}"

Dir.chdir(dir) do
  # each generated script prints its own name as Ruby sees it in $0
  File.open(fn_1, "w") { |f| f.puts "puts File.basename($0)" }
  File.open(fn_2, "w") { |f| f.puts "puts File.basename($0)" }
  
  puts
  puts "File.exist?(fn_1)  #{File.exist?(fn_1)}"
  puts "File.exist?(fn_2)  #{File.exist?(fn_2)}"
  puts
  puts `ruby #{fn_1}`
  puts `ruby #{fn_2}`
end

produces the following output -

ENV['RUBYOPT']
filesystem     Windows-1252
external       IBM437
`chcp`         Active code page: 437
locale         IBM437
__ENCODING__   UTF-8
internal

fn_1           UTF-8
äÖü.rb         UTF-8
fn_2           UTF-8
テスト.rb       UTF-8

File.exist?(fn_1)  true
File.exist?(fn_2)  true

äÖü.rb: No such file or directory @ realpath_rec - E:/temp/„™�.rb (Errno::ENOENT)

ruby: Invalid argument -- ???.rb (LoadError)

Of note is that both files are created (and appear correctly in Explorer), but neither can be used as a script argument.

Copying and pasting the names (from Explorer) onto the command line produces similar output -

E:\temp>ruby äÖü.rb
äÖü.rb: No such file or directory @ realpath_rec - E:/temp/„™�.rb (Errno::ENOENT)

E:\temp>ruby テスト.rb
ruby: Invalid argument -- ???.rb (LoadError)

Lastly, if I run chcp 1252 before running the code, so that 'locale' matches 'filesystem', the first script file runs and produces the correct output.

Hence, it appears that some (or all?) File methods deal properly with filesystem encoding, but ruby script arguments are not correctly encoded.
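
The garbled name in the first error is consistent with the name being converted to the console code page (IBM437) and those bytes then being read back as the filesystem code page (Windows-1252). The following is only a sketch of that hypothesis, not a trace of what the interpreter actually does; the variable names are made up for illustration.

```ruby
name = "äÖü.rb" # UTF-8 source literal

# Assumed step 1: the name is converted to the console (locale) code page.
as_locale = name.encode("IBM437")
locale_bytes = as_locale.bytes.first(3)

# Assumed step 2: the same bytes are read as the filesystem code page.
misread = as_locale.dup.force_encoding("Windows-1252")

# Convert to UTF-8 for display; bytes with no mapping are replaced.
garbled = misread.encode("UTF-8", invalid: :replace, undef: :replace)

puts locale_bytes.map { |b| format("0x%02X", b) }.join(" ")
puts garbled

# The second name has no IBM437 representation at all, so every
# character falls back to the default "?" replacement.
unmappable = "テスト.rb".encode("IBM437", undef: :replace)
puts unmappable
```

The first two characters come out as „ and ™, matching the error message, and the second name collapses to ???.rb, matching the LoadError. This would also explain why chcp 1252 fixes only the first file: with both code pages equal the bytes round-trip, but テスト still cannot be represented.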

Updated by MSP-Greg (Greg L) over 6 years ago

After seeing the CI test issue with 60743, I revisited this with a prior build (ruby 2.5.0dev (2017-11-11 trunk 60742) [x64-mingw32]).

The issue has been resolved: using the above code, ruby loads both files. I believe @nobu (Nobuyoshi Nakada) has been responsible for most of the encoding-related commits, so thanks again for your work.

Okay to close.

Updated by jeremyevans0 (Jeremy Evans) 7 months ago

  • Status changed from Open to Closed

Starting in Ruby 3.0, arguments provided to Ruby (in ARGV) are correctly encoded as UTF-8.
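
A quick way to check this on a given build is to print the encoding of each argument. This is a minimal sketch; describe_args is a hypothetical helper name, not part of Ruby.

```ruby
# Hypothetical helper: report the encoding of each argument string.
def describe_args(args = ARGV)
  args.map do |arg|
    { value: arg, encoding: arg.encoding.name, valid: arg.valid_encoding? }
  end
end

describe_args.each do |info|
  # inspect avoids encoding errors when a value contains invalid bytes
  puts "#{info[:encoding]} valid=#{info[:valid]} #{info[:value].inspect}"
end
```

Saved under any file name and run with a non-ASCII argument, it prints one line per argument, so a mismatch between the console code page and the reported encoding is easy to spot.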
