File.join trips over string encodings
It seems like File.join is unable to handle strings in encodings whose byte sequences contain null bytes, even though the strings are properly encoded. As a result, it cannot process inputs when the filenames on the system are encoded in this manner.
From an irb session:
1.9.3p125 :013 > File.join("a".encode("UTF-16LE")).encoding
 => #<Encoding:UTF-16LE>
1.9.3p125 :014 > File.join("a".encode("UTF-16LE"), "".encode("UTF-16LE"))
ArgumentError: string contains null byte
    from (irb):14:in `join'
    from (irb):14
    from /Users/andy/.rvm/rubies/ruby-1.9.3-p125/bin/irb:16:in `<main>'
I would expect the second command in that session to return "a/", just like File.join("a", "") does, but with the UTF-16LE encoding.
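For context on where the null byte comes from, here is a minimal sketch (the variable names are illustrative, not part of the report). It assumes only that UTF-16LE stores "a" as two bytes, the second of which is zero, so a byte-level null check can fire even though the string contains no NUL character.

```ruby
# "a" is a single character, but UTF-16LE stores it as two bytes: 0x61 0x00.
utf16_a = "a".encode("UTF-16LE")

p utf16_a.bytes            # => [97, 0]
p utf16_a.valid_encoding?  # => true

# A byte-level scan sees a zero byte...
has_nul_byte = utf16_a.bytes.include?(0)

# ...while a character-level scan finds no NUL character (U+0000).
has_nul_char = utf16_a.each_char.any? { |c| c.ord == 0 }

p has_nul_byte  # => true
p has_nul_char  # => false
```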
See https://groups.google.com/forum/?hl=en&fromgroups=#!topic/puppet-dev/C1YODJxd9Ws for where this came from.
- file.c (rb_file_join): path names must be ASCII-compatible. [Bug #7168]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@37207 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
- file.c (rb_file_join): check nul-byte only for strings, since FilePathStringValue() does it. [Bug #7168]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@37212 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
- file.c (rb_file_join): need to check again after any conversion run. [Bug #7168]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@37216 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
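The commit notes above restrict File.join to ASCII-compatible path names. As a rough caller-side sketch under that restriction (the helper join_via_utf8 is hypothetical and not part of Ruby or of these commits), one could transcode the parts to UTF-8, join them, and transcode the result back. It assumes every part can be converted to UTF-8 and that the first part's encoding is the one wanted for the result.

```ruby
# Encoding#ascii_compatible? reports whether an encoding is a superset of ASCII.
p Encoding::UTF_16LE.ascii_compatible?  # => false
p Encoding::UTF_8.ascii_compatible?     # => true

# Hypothetical helper: join path parts by round-tripping through UTF-8.
# Requires at least one part; the result takes the first part's encoding.
def join_via_utf8(*parts)
  target = parts.first.encoding
  utf8_parts = parts.map { |part| part.encode(Encoding::UTF_8) }
  File.join(*utf8_parts).encode(target)
end

result = join_via_utf8("a".encode("UTF-16LE"), "".encode("UTF-16LE"))
p result.encoding          # => #<Encoding:UTF-16LE>
p result.encode("UTF-8")   # => "a/"
```

This only moves the conversion to the caller; it does not change what File.join itself accepts.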
#1 Updated by nobu (Nobuyoshi Nakada) about 6 years ago
- Status changed from Open to Closed
- % Done changed from 0 to 100
#2 [ruby-core:48033] Updated by aparker42 (Andrew Parker) about 6 years ago
I don't see how that patch fixes the issue to achieve the behavior that I expected. It looks like it will just raise an explicit error instead of the ArgumentError.
For systems that have multi-byte, fixed-size encodings, how is this supposed to work?
#4 [ruby-core:48066] Updated by aparker42 (Andrew Parker) almost 6 years ago
You are right that Darwin doesn't have this problem, as far as I know. As I pointed out, the problem was originally found on Windows, where the filesystem uses UTF-16.
Why is the behavior that I expected (that File.join can work with any String) wrong?