Bug #7168

File.join trips over string encodings

Added by aparker42 (Andrew Parker) about 5 years ago. Updated about 5 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
ruby -v: ruby 1.9.3p125 (2012-02-16 revision 34643) [x86_64-darwin11.3.0]
[ruby-core:48012]

Description

File.join appears unable to handle strings in encodings whose byte representation contains null bytes, such as UTF-16LE, even when the string is validly encoded. As a result, it cannot process paths on systems whose filenames are encoded this way.

From an irb session:

1.9.3p125 :013 > File.join("a".encode("UTF-16LE")).encoding
 => #<Encoding:UTF-16LE> 
1.9.3p125 :014 > File.join("a".encode("UTF-16LE"), "".encode("UTF-16LE"))
ArgumentError: string contains null byte
from (irb):14:in `join'
from (irb):14
from /Users/andy/.rvm/rubies/ruby-1.9.3-p125/bin/irb:16:in `<main>'

I would expect the second command in that session to return "a/" just like File.join("a", "") does, but with the UTF-16LE encoding.
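
For illustration, the null byte reported in the error can be traced to the UTF-16LE code units themselves: each ASCII character is stored as two bytes, the second of which is zero. The following is a minimal sketch using only core String and Encoding methods; the variable name is chosen here for the example.

```ruby
# "a" in UTF-16LE is the two bytes 0x61 0x00, so a byte-wise scan for NUL
# finds one even though the string holds no NUL character.
utf16 = "a".encode("UTF-16LE")

utf16.bytes                 # => [97, 0]
utf16.bytes.include?(0)     # => true
utf16.valid_encoding?       # => true

# UTF-16LE is not ASCII-compatible, whereas UTF-8 is.
Encoding::UTF_16LE.ascii_compatible?  # => false
Encoding::UTF_8.ascii_compatible?     # => true
```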

See https://groups.google.com/forum/?hl=en&fromgroups=#!topic/puppet-dev/C1YODJxd9Ws for where this came from.

Associated revisions

Revision 37207
Added by nobu (Nobuyoshi Nakada) about 5 years ago

file.c: ASCII-compatible

  • file.c (rb_file_join): path names must be ASCII-compatible. [Bug #7168]

Revision 37212
Added by nobu (Nobuyoshi Nakada) about 5 years ago

file.c: ASCII-compatible

  • file.c (rb_file_join): check nul-byte only for strings, since FilePathStringValue() does it. [Bug #7168]

Revision 37216
Added by nobu (Nobuyoshi Nakada) about 5 years ago

file.c: ASCII-compatible

  • file.c (rb_file_join): need to check again after any conversion run. [Bug #7168]

History

#1 Updated by nobu (Nobuyoshi Nakada) about 5 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r37207.
Andrew, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


file.c: ASCII-compatible

  • file.c (rb_file_join): path names must be ASCII-compatible. [Bug #7168]

#2 [ruby-core:48033] Updated by aparker42 (Andrew Parker) about 5 years ago

I don't see how that patch achieves the behavior that I expected. It looks like it will just raise a more explicit error instead of the ArgumentError.

How is this supposed to work on systems that use multi-byte, fixed-size encodings?

#3 [ruby-core:48047] Updated by nobu (Nobuyoshi Nakada) about 5 years ago

Use UTF-8.

I don't think Darwin uses wide chars for file systems, though.
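
One way to read the "Use UTF-8" suggestion is to transcode the components before joining and convert the result back afterwards. The sketch below assumes that approach; `join_via_utf8` is a hypothetical helper written for this example, not part of Ruby, and it assumes at least one component is passed and that every component can be transcoded to UTF-8.

```ruby
# Hypothetical helper (not a Ruby API): join path components that are held in
# a non-ASCII-compatible encoding by round-tripping through UTF-8.
# Assumes at least one component, since the default result encoding is taken
# from the first one.
def join_via_utf8(*parts, result_encoding: parts.first.encoding)
  utf8_parts = parts.map { |part| part.encode(Encoding::UTF_8) }
  File.join(*utf8_parts).encode(result_encoding)
end

joined = join_via_utf8("a".encode("UTF-16LE"), "".encode("UTF-16LE"))
joined.encoding          # => #<Encoding:UTF-16LE>
joined.encode("UTF-8")   # => "a/"
```

This keeps File.join itself working only on ASCII-compatible strings, which is what the committed change requires, and moves the encoding conversion to the caller.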

#4 [ruby-core:48066] Updated by aparker42 (Andrew Parker) about 5 years ago

You are right that Darwin doesn't have this problem, as far as I know. As I pointed out, the problem was originally found on Windows, where the filesystem uses UTF-16.

Why is the behavior that I expected (that File.join can work with any String) wrong?
