Bug #15993

'require' doesn't work if there are Cyrillic chars in the path to Ruby dir

Added by inversion (Yura Babak) 12 months ago. Updated 11 months ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 2.6.3p62 (2019-04-16 revision 67580) [x64-mingw32]
[ruby-core:93655]

Description

I’m trying to build a cross-platform portable application with Ruby on board, and there is a problem on Windows.
A user usually installs it to the Roaming folder, which sits inside the user folder, and that folder's name is often non-Latin or contains spaces.
When there is a Cyrillic character (possibly any non-Latin character) in the path, require of any gem doesn’t work:

D:\users\киї\Ruby\2.6\bin>ruby -v
ruby 2.6.3p62 (2019-04-16 revision 67580) [x64-mingw32]

D:\users\киї\Ruby\2.6\bin>ruby -e "require 'logger'"
Traceback (most recent call last):
        1: from <internal:gem_prelude>:2:in `<internal:gem_prelude>'
<internal:gem_prelude>:2:in `require': No such file or directory -- D:/users/РєРёС—/Ruby/2.6/lib/ruby/2.6.0/rubygems.rb (LoadError)

D:\users\киї\Ruby\2.6\bin>ruby --disable=rubyopt -e "require 'logger'"
Traceback (most recent call last):
        1: from <internal:gem_prelude>:2:in `<internal:gem_prelude>'
<internal:gem_prelude>:2:in `require': No such file or directory -- D:/users/РєРёС—/Ruby/2.6/lib/ruby/2.6.0/rubygems.rb (LoadError)

D:\users\киї\Ruby\2.6\bin>gem list
Traceback (most recent call last):
        1: from <internal:gem_prelude>:2:in `<internal:gem_prelude>'
<internal:gem_prelude>:2:in `require': No such file or directory -- D:/users/РєРёС—/Ruby/2.6/lib/ruby/2.6.0/rubygems.rb (LoadError)

The error output shows what happens to the encoding: the UTF-8 bytes of the folder name are displayed as if they were Windows-1251:

киї (utf-8) == РєРёС— (win1251)
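
The garbling can be reproduced in plain Ruby. This is only a sketch of the byte-level effect, not of what the interpreter does internally; `misread_as_cp1251` is a hypothetical helper name introduced for illustration.

```ruby
# Hypothetical helper: take the UTF-8 bytes of a string, pretend they are
# Windows-1251, and convert the result back to UTF-8 so it can be printed.
def misread_as_cp1251(utf8_text)
  utf8_text.dup.force_encoding('Windows-1251').encode('UTF-8')
end

folder  = "\u043A\u0438\u0457"        # the folder name from the report, as UTF-8
garbled = misread_as_cp1251(folder)

puts folder.bytesize   # 6: each Cyrillic letter takes two bytes in UTF-8
puts garbled.length    # 6: every byte became its own Windows-1251 character
puts garbled           # prints the garbled form quoted in the error message
```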

I have an old Ruby installation that works fine:

D:\users\киї\Ruby\2.0\bin>ruby -e "require 'logger'"

D:\users\киї\Ruby\2.0\bin>ruby -v
ruby 2.0.0p451 (2014-02-24) [i386-mingw32]

The same holds for ruby 2.0.0p643 (2015-02-25) [i386-mingw32].

I also checked that require fails in the same way with
ruby 2.1.9p490 (2016-03-30 revision 54437) [i386-mingw32]

Updated by inversion (Yura Babak) 12 months ago

It looks like there is an ugly workaround.

1) Run chcp 1251 in the current console session.
2) Run Ruby with the --disable=gems option so that it does not fail at startup.
3) Add the following code at the very beginning of the script:

if $:[0].encoding.name == 'Windows-1251'
    $:.each {|path| path.encode! 'UTF-8' }
    $:.push '.'    # somehow it helps, looks like a modification of array is needed
    require 'rubygems'
end

This helped me to overcome the problem and run my script from a folder with Cyrillic and spaces in the path.
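
For context, `encode!` rewrites the bytes of a string, whereas `force_encoding` only changes the label attached to them. The sketch below shows the difference on a standalone string; it assumes the load path entries really held Windows-1251 bytes, which the report does not confirm.

```ruby
cyrillic = "\u043A\u0438\u0457"                   # the folder name as UTF-8
cp1251   = cyrillic.encode('Windows-1251')        # one byte per character

transcoded = cp1251.encode('UTF-8')               # bytes are rewritten
relabelled = cp1251.dup.force_encoding('UTF-8')   # same bytes, new label

puts cp1251.bytesize              # 3
puts transcoded.bytesize          # 6
puts transcoded == cyrillic       # true
puts relabelled.valid_encoding?   # false: Windows-1251 bytes are not valid UTF-8 here
```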

But it definitely should be fixed.

Updated by duerst (Martin Dürst) 11 months ago

ko1 (Koichi Sasada): I can check whether this bug is reproducible. But I'm not too familiar with how Ruby deals with the Windows file system. So I'm not confident I will be able to find and fix this bug.

Updated by MSP-Greg (Greg L) 11 months ago

On a US Windows system, I used a base Ruby folder of C:\Greg\Ruby киї (using a space and Cyrillic characters), and I could reproduce the issue.

Without any console chcp command, I did the following, which also solved the issue:

# start ruby with --disable=gems
$:.map! { |path| path.dup.force_encoding 'UTF-8' }
require 'rubygems'

require 'openssl'
puts OpenSSL::VERSION

I don't think spaces in Windows paths are an issue anymore, but I haven't rigorously checked...
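
The two workarounds in this thread could be folded into one helper. This is a rough sketch, not a fix: `utf8_load_path` is a hypothetical name, and it assumes that a path tagged as binary already holds UTF-8 bytes, while a path tagged with a code page holds bytes in that code page.

```ruby
# Hypothetical helper: return UTF-8 copies of the given paths without
# touching the originals.
def utf8_load_path(paths)
  paths.map do |path|
    copy = path.dup
    if copy.encoding == Encoding::UTF_8
      copy
    elsif copy.encoding == Encoding::BINARY
      # Assumption: binary-tagged paths already contain UTF-8 bytes,
      # so only the label needs to change. Leave the path alone otherwise.
      relabelled = copy.dup.force_encoding('UTF-8')
      relabelled.valid_encoding? ? relabelled : copy
    else
      # Assumption: the label matches the bytes, so transcoding is safe.
      copy.encode('UTF-8')
    end
  end
end

# Intended use, with Ruby started with --disable=gems:
#   $LOAD_PATH.replace(utf8_load_path($LOAD_PATH))
#   require 'rubygems'
```

Returning copies instead of mutating in place avoids errors on frozen strings, at the cost of replacing the whole array.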

Updated by MSP-Greg (Greg L) 11 months ago

While taking a break, I looked at this again. Below are the encodings of various items:

$LOAD_PATH
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/site_ruby/2.7.0
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/site_ruby/2.7.0/x64-msvcrt
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/site_ruby
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/vendor_ruby/2.7.0
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/vendor_ruby/2.7.0/x64-msvcrt
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/vendor_ruby
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/2.7.0
ASCII-8BIT      C:/Greg/Ruby киї/lib/ruby/2.7.0/x64-mingw32

IBM437          __FILE__
IBM437          __dir__
UTF-8           Dir.pwd

The encoding wasn't affected by using -E in RUBYOPT.
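
A listing like the one above can be produced with a few lines of Ruby. This is a sketch and not necessarily the script that was used; `encoding_report` is a hypothetical helper name.

```ruby
# Hypothetical helper: one "encoding  label" line per labelled string.
def encoding_report(labelled)
  labelled.map { |label, value| format('%-15s %s', value.encoding.name, label) }
end

puts '$LOAD_PATH'
$LOAD_PATH.each { |path| puts format('%-15s %s', path.encoding.name, path) }
puts
# __dir__ can be nil in some contexts, hence the to_s.
puts encoding_report('__FILE__' => __FILE__,
                     '__dir__'  => __dir__.to_s,
                     'Dir.pwd'  => Dir.pwd)
```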

Tested using today's trunk.
