Bug #7168

closed

File.join trips over string encodings

Added by aparker42 (Andrew Parker) over 11 years ago. Updated over 11 years ago.

Status:
Closed
Assignee:
-
Target version:
-
ruby -v:
ruby 1.9.3p125 (2012-02-16 revision 34643) [x86_64-darwin11.3.0]
Backport:
[ruby-core:48012]

Description

It seems that File.join cannot handle strings whose encoding produces null bytes in the byte representation (UTF-16LE, for example), even when the string is validly encoded. As a result, it cannot process input when the filenames on the system are encoded this way.

From an irb session:

1.9.3p125 :013 > File.join("a".encode("UTF-16LE")).encoding
 => #<Encoding:UTF-16LE> 
1.9.3p125 :014 > File.join("a".encode("UTF-16LE"), "".encode("UTF-16LE"))
ArgumentError: string contains null byte
from (irb):14:in `join'
from (irb):14
from /Users/andy/.rvm/rubies/ruby-1.9.3-p125/bin/irb:16:in `<main>'

I would expect the second command in that session to return "a/" just like File.join("a", "") does, but with the UTF-16LE encoding.

See https://groups.google.com/forum/?hl=en&fromgroups=#!topic/puppet-dev/C1YODJxd9Ws for where this came from.
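
As a small illustration of why the "string contains null byte" check fires here, the sketch below looks at the raw bytes of the same one-character string in both encodings. The variable names are only for illustration; the point is that an ASCII character in UTF-16LE carries a zero high byte, so a byte-level scan for NUL would reject a perfectly valid string.

```ruby
# UTF-16LE stores "a" as two bytes: 0x61 followed by 0x00.
utf16 = "a".encode("UTF-16LE")
utf16_bytes = utf16.bytes            # [97, 0]

# A byte-level search for NUL therefore finds one, although the string
# is valid UTF-16LE and contains no NUL character.
has_nul = utf16_bytes.include?(0)    # true
valid = utf16.valid_encoding?        # true

# The same character in UTF-8 has no zero byte.
utf8_bytes = "a".encode("UTF-8").bytes  # [97]
```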

Updated by nobu (Nobuyoshi Nakada) over 11 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r37207.
Andrew, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


file.c: ASCII-compatible

Updated by aparker42 (Andrew Parker) over 11 years ago

I don't see how that patch fixes the issue to achieve the behavior that I expected. It looks like it will just raise an explicit error instead of the ArgumentError.

For systems that use multi-byte, fixed-size encodings, how is this supposed to work?

Updated by nobu (Nobuyoshi Nakada) over 11 years ago

Use UTF-8.

I don't think Darwin uses wide chars for file systems, though.
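
One possible reading of the "Use UTF-8" suggestion, sketched below, is to transcode the components to UTF-8, join them, and transcode the result back. The helper name join_transcoded is hypothetical and not part of Ruby; it also assumes at least one component is given and that all components share the first component's encoding.

```ruby
# Hypothetical helper (not a Ruby API): join path components that are in a
# non-ASCII-compatible encoding by round-tripping through UTF-8.
def join_transcoded(*parts)
  target = parts.first.encoding   # assumes parts is non-empty
  utf8_parts = parts.map { |part| part.encode(Encoding::UTF_8) }
  File.join(*utf8_parts).encode(target)
end

a = "a".encode("UTF-16LE")
empty = "".encode("UTF-16LE")

joined = join_transcoded(a, empty)
joined.encoding                     # Encoding::UTF_16LE
joined == "a/".encode("UTF-16LE")   # true
```

This gives the result the report expected ("a/" in UTF-16LE), but only as a workaround at the call site; it does not change what File.join itself accepts.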

Updated by aparker42 (Andrew Parker) over 11 years ago

You are right that Darwin doesn't have this problem, as far as I know. As I pointed out, the problem was originally found on Windows, where the filesystem uses UTF-16.

Why is the behavior that I expected (that File.join can work with any String) wrong?
