Bug #15993
Open
'require' doesn't work if there are Cyrillic chars in the path to Ruby dir
Description
I’m trying to build a cross-platform portable application with Ruby onboard and there is a problem on Windows.
A user usually installs it to the Roaming folder, which sits inside the user folder, and the user folder's name is often non-Latin or contains spaces.
When there is a Cyrillic (or perhaps any non-Latin) character in the path, requiring any gem fails:
D:\users\киї\Ruby\2.6\bin>ruby -v
ruby 2.6.3p62 (2019-04-16 revision 67580) [x64-mingw32]
D:\users\киї\Ruby\2.6\bin>ruby -e "require 'logger'"
Traceback (most recent call last):
1: from <internal:gem_prelude>:2:in `<internal:gem_prelude>'
<internal:gem_prelude>:2:in `require': No such file or directory -- D:/users/РєРёС—/Ruby/2.6/lib/ruby/2.6.0/rubygems.rb (LoadError)
D:\users\киї\Ruby\2.6\bin>ruby --disable=rubyopt -e "require 'logger'"
Traceback (most recent call last):
1: from <internal:gem_prelude>:2:in `<internal:gem_prelude>'
<internal:gem_prelude>:2:in `require': No such file or directory -- D:/users/РєРёС—/Ruby/2.6/lib/ruby/2.6.0/rubygems.rb (LoadError)
D:\users\киї\Ruby\2.6\bin>gem list
Traceback (most recent call last):
1: from <internal:gem_prelude>:2:in `<internal:gem_prelude>'
<internal:gem_prelude>:2:in `require': No such file or directory -- D:/users/РєРёС—/Ruby/2.6/lib/ruby/2.6.0/rubygems.rb (LoadError)
We can see such encoding transformations in the output:
киї (utf-8) == РєРёС— (win1251)
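The transformation above can be reproduced in plain Ruby. This is only a small sketch of the misreading itself (UTF-8 bytes relabeled as Windows-1251), not of the code path inside Ruby that causes it; the variable names are illustrative.

```ruby
# The directory name from the report, written with escapes so the
# example does not depend on how this file is saved.
original = "\u043A\u0438\u0457" # "киї" as a UTF-8 string

# Relabel the UTF-8 bytes as Windows-1251 (no bytes change), then
# transcode back to UTF-8 so the misread characters can be printed.
misread = original.dup
                  .force_encoding(Encoding::Windows_1251)
                  .encode(Encoding::UTF_8)

# Three Cyrillic letters occupy six bytes in UTF-8.
puts original.bytes.map { |b| format("%02X", b) }.join(" ")

# Each byte is read as one Windows-1251 character, so six characters appear.
puts misread
puts misread.length
```

Each two-byte UTF-8 sequence turns into two single-byte Windows-1251 characters, which is why the three-letter directory name shows up as six characters in the error message.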
I have an old Ruby installation that works fine:
D:\users\киї\Ruby\2.0\bin>ruby -e "require 'logger'"
D:\users\киї\Ruby\2.0\bin>ruby -v
ruby 2.0.0p451 (2014-02-24) [i386-mingw32]
The same holds for ruby 2.0.0p643 (2015-02-25) [i386-mingw32].
I also checked that require fails in the same case for
ruby 2.1.9p490 (2016-03-30 revision 54437) [i386-mingw32]
Updated by inversion (Yura Babak) over 5 years ago
Looks like there is an ugly workaround.
- Run chcp 1251 in the current console session.
- Run Ruby with the --disable=gems option so that it does not fail at startup.
- Add the following code at the very beginning of the script:
if $:[0].encoding.name == 'Windows-1251'
$:.each {|path| path.encode! 'UTF-8' }
$:.push '.' # somehow it helps, looks like a modification of array is needed
require 'rubygems'
end
This helped me to overcome the problem and run my script from a folder with Cyrillic and spaces in the path.
But it definitely should be fixed.
Updated by duerst (Martin Dürst) over 5 years ago
@ko1 (Koichi Sasada): I can check whether this bug is reproducible. But I'm not too familiar with how Ruby deals with the Windows file system. So I'm not confident I will be able to find and fix this bug.
Updated by MSP-Greg (Greg L) over 5 years ago
On a US Windows system, I used a base Ruby folder of C:\Greg\Ruby киї (using a space and Cyrillic characters), and I could reproduce the issue.
Without any console chcp command, I did the following, which also solved the issue:
# start ruby with --disable=gems
$:.map! { |path| path.dup.force_encoding 'UTF-8' }
require 'rubygems'
require 'openssl'
puts OpenSSL::VERSION
I don't think spaces in Windows paths are an issue anymore, but I haven't rigorously checked...
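The two workarounds so far differ in one detail: one transcodes the entries with encode!, the other only relabels them with force_encoding. A minimal sketch of a helper that picks between the two per entry is shown below. The name utf8_path is hypothetical (it is not part of Ruby or RubyGems), and the heuristic "if the bytes are already valid UTF-8, just relabel them" is an assumption that fits the outputs posted in this thread, not a guaranteed rule.

```ruby
# Hypothetical helper: return a UTF-8 labeled copy of a path.
def utf8_path(path)
  # Case 1: the bytes are already UTF-8 but carry another label
  # (for example ASCII-8BIT). Relabeling is enough.
  relabeled = path.dup.force_encoding(Encoding::UTF_8)
  return relabeled if relabeled.valid_encoding?

  # Case 2: binary data that is not UTF-8 cannot be transcoded safely,
  # so leave it untouched.
  return path if path.encoding == Encoding::BINARY

  # Case 3: the bytes really are in the labeled code page
  # (for example Windows-1251), so transcode them.
  path.encode(Encoding::UTF_8)
end

binary_entry = "C:/Greg/Ruby \u043A\u0438\u0457/lib/ruby/2.7.0".b
p utf8_path(binary_entry).encoding
```

With Ruby started using --disable=gems, it could be applied as `$LOAD_PATH.map! { |path| utf8_path(path) }` before `require 'rubygems'`, in the same spirit as the snippets above.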
Updated by MSP-Greg (Greg L) over 5 years ago
While taking a break, looked at this again. Below is the encoding of various items:
$LOAD_PATH
ASCII-8BIT C:/Greg/Ruby киї/lib/ruby/site_ruby/2.7.0
ASCII-8BIT C:/Greg/Ruby киї/lib/ruby/site_ruby/2.7.0/x64-msvcrt
ASCII-8BIT C:/Greg/Ruby киї/lib/ruby/site_ruby
ASCII-8BIT C:/Greg/Ruby киї/lib/ruby/vendor_ruby/2.7.0
ASCII-8BIT C:/Greg/Ruby киї/lib/ruby/vendor_ruby/2.7.0/x64-msvcrt
ASCII-8BIT C:/Greg/Ruby киї/lib/ruby/vendor_ruby
ASCII-8BIT C:/Greg/Ruby киї/lib/ruby/2.7.0
ASCII-8BIT C:/Greg/Ruby киї/lib/ruby/2.7.0/x64-mingw32
IBM437 __FILE__
IBM437 __dir__
UTF-8 Dir.pwd
The encoding wasn't affected by using -E in RUBYOPT.
Tested using today's trunk.
Updated by jeremyevans0 (Jeremy Evans) over 4 years ago
- Related to Bug #15655: Unable to handle Russian dirname on Windows added
Updated by tschoening (Thorsten Schöning) about 4 years ago
I think I have a similar problem originally reported at GitHub already:
https://github.com/rubygems/rubygems/issues/3853
I have a Ruby-based shell application which needs to require a library during startup. I'm using the following command line:
"..\ruby\bin\ruby.exe" "-I../runtime/lib" "../visualizer/bin/ksv" "--require=de/[...]/par_opp_dispatcher.rb" "--opaque-types=true" "../files_to_show/recs_clt.bin" "de/[...]/par_recs_clt.rb"
This results in the following error. The first line shows the current directory I'm in, which contains the German umlaut ü. With an ASCII-only path, things work as expected.
C:\[...]\Müller electronic\[...]\ks_ruby_visualizer>show.cmd
Traceback (most recent call last):
1: from <internal:gem_prelude>:2:in `<internal:gem_prelude>'
<internal:gem_prelude>:2:in `require': No such file or directory -- C:/[...]/Müller electronic/[...]/rubygems.rb (LoadError)
The problem seems to be that at some point Ruby forwards UTF-8 encoded bytes to the file system, and such a path simply doesn't exist. Interestingly, the path is forwarded correctly several times before that, according to the following ProcMon log:
18:57:48,7938985 ruby.exe 15296 CreateFile C:\[...]\Müller electronic\[...]\rubygems.rb SUCCESS Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened
18:57:48,7940217 ruby.exe 15296 QueryBasicInformationFile C:\[...]\Müller electronic\[...]\rubygems.rb SUCCESS CreationTime: 24.07.2020 14:48:44, LastAccessTime: 24.07.2020 14:48:44, LastWriteTime: 01.10.2019 23:01:05, ChangeTime: 04.02.2020 22:30:28, FileAttributes: A 0x80000
18:57:48,7940500 ruby.exe 15296 CloseFile C:\[...]\Müller electronic\[...]\rubygems.rb SUCCESS
18:57:48,7942644 ruby.exe 15296 CreateFile C:\[...]\Müller electronic\[...]\rubygems.rb SUCCESS Desired Access: Generic Read, Disposition: Open, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Write, AllocationSize: n/a, OpenResult: Opened
18:57:48,7943188 ruby.exe 15296 CloseFile C:\[...]\Müller electronic\[...]\rubygems.rb SUCCESS
18:57:48,7945545 ruby.exe 15296 CreateFile C:\[...]\Müller electronic\[...]\rubygems.rb PATH NOT FOUND Desired Access: Generic Read, Disposition: Open, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Write, AllocationSize: n/a
Here are my current environment details:
$ gem env version
3.0.3
- Windows 10 1909 x86-64
- default codepages Windows-1252 and CP-850
- Ruby 2.6.5
Updated by jeremyevans0 (Jeremy Evans) over 3 years ago
- Status changed from Open to Closed
This appears to be fixed starting in Ruby 2.7 (also works in 3.0):
D:\Евгений>C:\Ruby26-x64\bin\ruby -I D:\Евгений -e "require 'logger'"
Traceback (most recent call last):
2: from -e:1:in `<main>'
1: from C:/Ruby26-x64/lib/ruby/2.6.0/rubygems/core_ext/kernel_require.rb:54:in `require'
C:/Ruby26-x64/lib/ruby/2.6.0/rubygems/core_ext/kernel_require.rb:54:in `require': No such file or directory -- D:/Евгений/logger.rb (LoadError)
D:\Евгений>C:\Ruby27-x64\bin\ruby -I D:\Евгений -e "require 'logger'"
D:\Евгений>C:\Ruby30-x64\bin\ruby -I D:\Евгений -e "require 'logger'"
As Ruby 2.6 is in security maintenance mode, the change will not be backported.
Updated by inversion (Yura Babak) over 3 years ago
jeremyevans0 (Jeremy Evans) wrote in #note-7:
This appears to be fixed starting in Ruby 2.7 (also works in 3.0):
Still, there is a problem.
require 'bundler/setup' fails if LOAD_PATH or Gem.dir contain Cyrillic chars; the error is similar to:
incompatible character encodings: ASCII-8BIT and UTF-8 (Encoding::CompatibilityError)
From the trace I have prepared a minimal reproducible case:
- Put Ruby in a location where the path contains Cyrillic chars, like "D:\users\киї\Ruby"
- Prepare 2 files (saved in UTF-8 encoding) in a location where the path contains Cyrillic chars (can be near that Ruby): https://gist.github.com/Inversion-des/75949795cc5be707c19d31901e79d1cf
- Open cmd and run chcp 1251 in the current console session.
- Run "[this Ruby path]" f1.rb
You will see that the __dir__ output differs between the two files (f2 is required by f1). If you run f2.rb directly, the output is the same as for f1. So require_relative somehow changes the encoding here.
To emulate the problems with 'bundler/setup', the files contain the following lines:
# fails: incompatible character encodings: Windows-1251 and UTF-8 (Encoding::CompatibilityError)
p start_with:$LOAD_PATH[0].start_with?(__dir__)
# fails: incompatible character encodings: UTF-8 and ASCII-8BIT (Encoding::CompatibilityError)
p start_with:$LOAD_PATH[0].start_with?(Gem.dir)
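The same failure can be shown without a specially placed Ruby installation. The sketch below uses hypothetical paths modeled on the ones in this thread; it only demonstrates how String#start_with? behaves when both strings contain non-ASCII bytes and carry different encoding labels, and that relabeling one side makes the comparison work.

```ruby
# Hypothetical paths, not taken from a real installation.
install_root    = "d:/zz-k\u00F6nnen2/Ruby31-x64"      # labeled UTF-8
load_path_entry = "#{install_root}/lib/ruby/3.1.0".b   # same bytes, labeled ASCII-8BIT

# Wrap start_with? so the exception can be reported instead of aborting.
def prefix?(path, prefix)
  path.start_with?(prefix)
rescue Encoding::CompatibilityError => e
  "raised #{e.class}"
end

# Different labels plus non-ASCII bytes on both sides: the check raises.
puts prefix?(load_path_entry, install_root)

# After relabeling the entry as UTF-8, the byte-wise prefix check succeeds.
relabeled = load_path_entry.dup.force_encoding(Encoding::UTF_8)
puts prefix?(relabeled, install_root)
```

For pure ASCII paths the labels do not matter, because ASCII-only strings are compatible with each other regardless of encoding, which is presumably why the problem only appears with non-ASCII installation paths.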
To see the real problem, comment out these lines and also prepare the following files (I'm not sure the content is important, but add at least one gem there):
- Gemfile
- Gemfile.lock
To see both problems, there should also be a .bundle\config file with a line like:
BUNDLE_PATH: "../platform/Ruby_gems"
In bundler\settings.rb, explicit_path is used if BUNDLE_PATH is defined, and Bundler.rubygems.gem_dir otherwise.
A workaround for both errors can be found in f1.rb, in the related commented-out section:
Gem.dir.force_encoding 'UTF-8'
Gem.path.each {|path| path.force_encoding 'UTF-8' }
if $:[0].encoding.name == 'Windows-1251'
$:.each {|path| path.encode! 'UTF-8' }
$:.push '.' # somehow it helps, looks like a modification of array is needed
end
My environment:
- Windows10 Pro
- Ruby 3.0.1p64 (2021-04-05 revision 0fb782ee38) [x64-mingw32]
- Bundler version 2.2.22
- RubyGems version 3.2.22
Updated by jeremyevans0 (Jeremy Evans) over 3 years ago
- Status changed from Closed to Open
inversion (Yura Babak) wrote in #note-8:
jeremyevans0 (Jeremy Evans) wrote in #note-7:
This appears to be fixed starting in Ruby 2.7 (also works in 3.0):
Still, there is a problem.
require 'bundler/setup' fails if LOAD_PATH or Gem.dir contain Cyrillic chars; the error is similar to:
incompatible character encodings: ASCII-8BIT and UTF-8 (Encoding::CompatibilityError)
From the trace I have prepared a minimal reproducible case:
- Put Ruby in a location where the path contains Cyrillic chars, like "D:\users\киї\Ruby"
- Prepare 2 files (saved in UTF-8 encoding) in a location where the path contains Cyrillic chars (can be near that Ruby): https://gist.github.com/Inversion-des/75949795cc5be707c19d31901e79d1cf
- Open cmd and run chcp 1251 in the current console session.
- Run "[this Ruby path]" f1.rb
I was able to reproduce the issue, but only when I installed Ruby into a path not supported by the Windows-1251 encoding:
d:\Евгений>d:\zz-können2\Ruby31-x64\bin\bundle install --local
d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler.rb:94:in `expand_path': incompatible character encodings: ASCII-8BIT and UTF-8 (Encoding::CompatibilityError)
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler.rb:94:in `expand_path'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler.rb:94:in `bundle_path'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler.rb:682:in `configure_gem_home'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler.rb:663:in `configure_gem_home_and_path'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler.rb:80:in `configure'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler.rb:193:in `definition'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler/cli/install.rb:57:in `run'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler/cli.rb:259:in `block in install'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler/settings.rb:133:in `temporary'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler/cli.rb:258:in `install'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler/vendor/thor/lib/thor/invocation.rb:127:in `invoke_command'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler/vendor/thor/lib/thor.rb:392:in `dispatch'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler/cli.rb:30:in `dispatch'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler/vendor/thor/lib/thor/base.rb:485:in `start'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler/cli.rb:24:in `start'
from d:/zz-können2/Ruby31-x64/lib/ruby/gems/3.1.0/gems/bundler-2.3.0.dev/libexec/bundle:49:in `block in <top (required)>'
from d:/zz-können2/Ruby31-x64/lib/ruby/3.1.0/bundler/friendly_errors.rb:130:in `with_friendly_errors'
from d:/zz-können2/Ruby31-x64/lib/ruby/gems/3.1.0/gems/bundler-2.3.0.dev/libexec/bundle:37:in `<top (required)>'
from d:/zz-k?nnen2/Ruby31-x64/bin/bundle:31:in `load'
from d:/zz-k?nnen2/Ruby31-x64/bin/bundle:31:in `<main>'
Part of the underlying issue seems to be that __FILE__ and __dir__ are not UTF-8 encoded for the main script, unlike required files. I'm not sure if changing that alone will fix the issue, though.
When I run the following script (f3.rb):
p ['__FILE__', __FILE__, __FILE__.encoding]
p ['__dir__', __dir__, __dir__.encoding]
p ['Gem.dir', Gem.dir, Gem.dir.encoding]
puts 'Gem.path'
Gem.path.each do |s|
p [s, s.encoding]
end
puts '$:'
$:.each do |s|
p [s, s.encoding]
end
I get the following when using Ruby installed in a non-ASCII path:
d:\Евгений>d:\zz-können2\Ruby31-x64\bin\ruby D:\Евгений\f3.rb
["__FILE__", "D:/\xC5\xE2\xE3\xE5\xED\xE8\xE9/f3.rb", #<Encoding:Windows-1251>]
["__dir__", "D:/\xC5\xE2\xE3\xE5\xED\xE8\xE9", #<Encoding:Windows-1251>]
["Gem.dir", "d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/gems/3.1.0", #<Encoding:ASCII-8BIT>]
Gem.path
["C:/Users/jeremye/.gem/ruby/3.1.0", #<Encoding:UTF-8>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/gems/3.1.0", #<Encoding:ASCII-8BIT>]
$:
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/site_ruby/3.1.0", #<Encoding:ASCII-8BIT>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/site_ruby/3.1.0/x64-ucrt", #<Encoding:ASCII-8BIT>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/site_ruby", #<Encoding:ASCII-8BIT>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/vendor_ruby/3.1.0", #<Encoding:ASCII-8BIT>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/vendor_ruby/3.1.0/x64-ucrt", #<Encoding:ASCII-8BIT>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/vendor_ruby", #<Encoding:ASCII-8BIT>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/3.1.0", #<Encoding:ASCII-8BIT>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/3.1.0/x64-mingw-ucrt", #<Encoding:ASCII-8BIT>]
and when installed into an ASCII path:
d:\Евгений>C:\Ruby30-x64\bin\ruby d:\Евгений\f3.rb
["__FILE__", "d:/\xC5\xE2\xE3\xE5\xED\xE8\xE9/f3.rb", #<Encoding:Windows-1251>]
["__dir__", "d:/\xC5\xE2\xE3\xE5\xED\xE8\xE9", #<Encoding:Windows-1251>]
["Gem.dir", "C:/Ruby30-x64/lib/ruby/gems/3.0.0", #<Encoding:ASCII-8BIT>]
Gem.path
["C:/Users/jeremye/.gem/ruby/3.0.0", #<Encoding:UTF-8>]
["C:/Ruby30-x64/lib/ruby/gems/3.0.0", #<Encoding:ASCII-8BIT>]
$:
["C:/Ruby30-x64/lib/ruby/site_ruby/3.0.0", #<Encoding:Windows-1251>]
["C:/Ruby30-x64/lib/ruby/site_ruby/3.0.0/x64-msvcrt", #<Encoding:Windows-1251>]
["C:/Ruby30-x64/lib/ruby/site_ruby", #<Encoding:Windows-1251>]
["C:/Ruby30-x64/lib/ruby/vendor_ruby/3.0.0", #<Encoding:Windows-1251>]
["C:/Ruby30-x64/lib/ruby/vendor_ruby/3.0.0/x64-msvcrt", #<Encoding:Windows-1251>]
["C:/Ruby30-x64/lib/ruby/vendor_ruby", #<Encoding:Windows-1251>]
["C:/Ruby30-x64/lib/ruby/3.0.0", #<Encoding:Windows-1251>]
["C:/Ruby30-x64/lib/ruby/3.0.0/x64-mingw32", #<Encoding:Windows-1251>]
It looks like the difference in the non-ASCII path case is that ASCII-8BIT encoding is used even if the path itself is valid UTF-8. This is true even if you force a UTF-8 code page (though that does fix __FILE__ and __dir__):
d:\Евгений>chcp 65001
Active code page: 65001
d:\Евгений>d:\zz-können2\Ruby31-x64\bin\ruby D:\Евгений\f3.rb
["__FILE__", "D:/Евгений/f3.rb", #<Encoding:UTF-8>]
["__dir__", "D:/Евгений", #<Encoding:UTF-8>]
["Gem.dir", "d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/gems/3.1.0", #<Encoding:ASCII-8BIT>]
Gem.path
["C:/Users/jeremye/.gem/ruby/3.1.0", #<Encoding:UTF-8>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/gems/3.1.0", #<Encoding:ASCII-8BIT>]
$:
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/site_ruby/3.1.0", #<Encoding:ASCII-8BIT>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/site_ruby/3.1.0/x64-ucrt", #<Encoding:ASCII-8BIT>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/site_ruby", #<Encoding:ASCII-8BIT>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/vendor_ruby/3.1.0", #<Encoding:ASCII-8BIT>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/vendor_ruby/3.1.0/x64-ucrt", #<Encoding:ASCII-8BIT>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/vendor_ruby", #<Encoding:ASCII-8BIT>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/3.1.0", #<Encoding:ASCII-8BIT>]
["d:/zz-k\xC3\xB6nnen2/Ruby31-x64/lib/ruby/3.1.0/x64-mingw-ucrt", #<Encoding:ASCII-8BIT>]
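The observation that the ASCII-8BIT entries hold valid UTF-8 bytes can be checked mechanically. The following is a small diagnostic sketch along the lines of f3.rb; describe_path is a hypothetical helper name, and the script only reports what it finds without changing any path.

```ruby
require "rubygems"

# Hypothetical helper: report the encoding label of a path and whether
# its bytes would also be valid if read as UTF-8.
def describe_path(path)
  relabeled = path.to_s.dup.force_encoding(Encoding::UTF_8)
  {
    label: path.to_s.encoding,
    ascii_only: path.to_s.ascii_only?,
    valid_utf8: relabeled.valid_encoding?
  }
end

# Print one line per path known to RubyGems and the load path.
[Gem.dir, *Gem.path, *$LOAD_PATH].each do |path|
  p [path, describe_path(path)]
end
```

An entry reported with an ASCII-8BIT label, ascii_only false, and valid_utf8 true matches the situation described above: the bytes are fine, only the label is wrong.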
Since there does appear to be an issue, I'll reopen this. Hopefully someone with more knowledge in this area can suggest a possible fix.
Updated by inversion (Yura Babak) over 3 years ago
- ruby -v changed from ruby 2.6.3p62 (2019-04-16 revision 67580) [x64-mingw32] to 3.0.1p64 (2021-04-05 revision 0fb782ee38) [x64-mingw32]