Backport #6377

$LOADED_FEATURES entry via YAML is binary data?

Added by trans (Thomas Sawyer) over 7 years ago. Updated almost 7 years ago.

Status: Closed
Priority: Normal
[ruby-core:44750]

Description

=begin
While working with $LOADED_FEATURES, I came across this odd result:

trans@logisys:courtier$ irb

irb(main):001:0> require 'yaml'

irb(main):002:0> puts $LOADED_FEATURES.join("\n")
enumerator.so
/home/trans/.local/lib/ry/rubies/1.9.3-p125/lib/ruby/1.9.1/x86_64-linux/enc/encdb.so
/home/trans/.local/lib/ry/rubies/1.9.3-p125/lib/ruby/1.9.1/x86_64-linux/enc/trans/transdb.so
/home/trans/.local/lib/ry/rubies/1.9.3-p125/lib/ruby/1.9.1/rubygems/defaults.rb
...

irb(main):003:0> y $LOADED_FEATURES


- enumerator.so
- !binary |-
  L2hvbWUvdHJhbnMvLmxvY2FsL2xpYi9yeS9ydWJpZXMvMS45LjMtcDEyNS9s
  aWIvcnVieS8xLjkuMS94ODZfNjQtbGludXgvZW5jL2VuY2RiLnNv
- /home/trans/.local/lib/ry/rubies/1.9.3-p125/lib/ruby/1.9.1/x86_64-linux/enc/trans/transdb.so
- /home/trans/.local/lib/ry/rubies/1.9.3-p125/lib/ruby/1.9.1/rubygems/defaults.rb
...

=end
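
For readers following along, the `!binary` entry in the report is ordinary Base64. The sketch below decodes it to show that it holds the same `encdb.so` path printed by `puts`; the constant and method names are illustrative only and do not come from the report.

```ruby
# Base64 payload copied from the "!binary" entry in the report.
ENCODED_FEATURE =
  "L2hvbWUvdHJhbnMvLmxvY2FsL2xpYi9yeS9ydWJpZXMvMS45LjMtcDEyNS9s" \
  "aWIvcnVieS8xLjkuMS94ODZfNjQtbGludXgvZW5jL2VuY2RiLnNv"

# "m" is the Base64 directive of String#unpack1 (illustrative helper name).
def decode_binary_scalar(base64_text)
  base64_text.unpack1("m")
end

decoded = decode_binary_scalar(ENCODED_FEATURE)
puts decoded              # the decoded path
puts decoded.encoding     # the encoding tag of the decoded string
puts decoded.ascii_only?  # whether every byte is 7-bit
```

The decoded bytes are plain 7-bit text, which suggests the entry was emitted as binary because of its encoding tag rather than its content.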

Associated revisions

Revision c68d2996
Added by nobu (Nobuyoshi Nakada) about 7 years ago

load.c: keep encoding of feature name

  • file.c (rb_find_file_ext_safe, rb_find_file_safe): default to US-ASCII for encdb and transdb.
  • load.c (search_required): keep encoding of feature name. set loading path to filesystem encoding. [Bug #6377][ruby-core:44750]
  • ruby.c (add_modules, require_libraries): assume default external encoding as well as ARGV.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@36800 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 36800
Added by nobu (Nobuyoshi Nakada) about 7 years ago

load.c: keep encoding of feature name

  • file.c (rb_find_file_ext_safe, rb_find_file_safe): default to US-ASCII for encdb and transdb.
  • load.c (search_required): keep encoding of feature name. set loading path to filesystem encoding. [Bug #6377][ruby-core:44750]
  • ruby.c (add_modules, require_libraries): assume default external encoding as well as ARGV.
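
As a rough Ruby-level illustration of what "set loading path to filesystem encoding" means, the sketch below re-tags a string without touching its bytes. The actual change lives in C (load.c); the helper name and the sample path here are hypothetical.

```ruby
# Hypothetical helper: return a copy of the path tagged with the
# filesystem encoding. Only the encoding tag changes, not the bytes.
def retag_as_filesystem(path)
  path.dup.force_encoding(Encoding.find("filesystem"))
end

raw = "/usr/lib/ruby/enc/encdb.so".b  # made-up path, tagged ASCII-8BIT
tagged = retag_as_filesystem(raw)

puts raw.encoding
puts tagged.encoding
puts raw.bytes == tagged.bytes
```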

Revision e9ff4a22
Added by usa (Usaku NAKAMURA) almost 7 years ago

merge revision(s) 36800: [Backport #6377]

  • file.c (rb_find_file_ext_safe, rb_find_file_safe): default to
    US-ASCII for encdb and transdb.

  • load.c (search_required): keep encoding of feature name. set
    loading path to filesystem encoding. [Bug #6377][ruby-core:44750]

  • ruby.c (add_modules, require_libraries): assume default external
    encoding as well as ARGV.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_1_9_3@37209 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 37209
Added by usa (Usaku NAKAMURA) almost 7 years ago

merge revision(s) 36800: [Backport #6377]

  • file.c (rb_find_file_ext_safe, rb_find_file_safe): default to
    US-ASCII for encdb and transdb.

  • load.c (search_required): keep encoding of feature name. set
    loading path to filesystem encoding. [Bug #6377][ruby-core:44750]

  • ruby.c (add_modules, require_libraries): assume default external
    encoding as well as ARGV.

History

Updated by nobu (Nobuyoshi Nakada) over 7 years ago

  • Status changed from Open to Assigned
  • Assignee set to tenderlovemaking (Aaron Patterson)

It sounds like Psych always treats ASCII-8BIT strings as binary data, even when they contain only 7-bit characters.

Updated by tenderlovemaking (Aaron Patterson) about 7 years ago

I'm not sure how, or whether, I should fix this. There are two problems: 1) we lose encoding information, and 2) we have to decide what counts as "binary".

For #1, if we treat 7-bit-only ASCII strings as "non-binary", then when we load the data back in, the string will be tagged as UTF-8 (since "raw" YAML strings are Unicode). For example, this test passes today, but it will fail if we treat 7-bit ASCII strings as non-binary:

s = "hello".encode('ASCII-8BIT')
assert_equal s.encoding, YAML.load(YAML.dump(s)).encoding
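
A self-contained way to observe this round trip is sketched below. The helper name is illustrative, and the dumped text and resulting encoding depend on the installed Psych version, so the script prints them rather than assuming a result.

```ruby
require "yaml"

# Illustrative helper: dump a string to YAML, load it back, and return
# both the YAML text and the reloaded value.
def yaml_round_trip(str)
  dumped = YAML.dump(str)
  [dumped, YAML.load(dumped)]
end

s = "hello".encode("ASCII-8BIT")
dumped, loaded = yaml_round_trip(s)

puts dumped
puts "before: #{s.encoding}, after: #{loaded.encoding}"
puts "same bytes: #{s.bytes == loaded.bytes}"
```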

For #2, I'm not sure how we decide what is binary and what is not. Should strings that contain null bytes be considered binary? If so, we can't use the ascii_only? method:

"\0".ascii_only? # => true

Given the data loss from #1 and the difficulty of #2, I don't think Psych should change. I'm open to suggestions for dealing with these problems.
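
To make the difficulty in #2 concrete, here is a minimal sketch of one possible heuristic. It is hypothetical and not something Psych is known to implement: it treats a string as binary if it contains a NUL byte, or if it is tagged ASCII-8BIT and has bytes outside the 7-bit range.

```ruby
# Hypothetical heuristic for illustration only; not Psych's behavior.
def looks_binary?(str)
  # A NUL byte counts as binary even though ascii_only? accepts it.
  return true if str.each_byte.any?(&:zero?)
  str.encoding == Encoding::ASCII_8BIT && !str.ascii_only?
end

["\0", "hello".b, "\xFF".b, "h\u00e9llo"].each do |sample|
  puts "#{sample.inspect}: ascii_only?=#{sample.ascii_only?} looks_binary?=#{looks_binary?(sample)}"
end
```

Even this small rule shows the trade-off: any threshold for "binary" is a policy choice, and it still does nothing about the encoding tag lost on load.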

Nobu: Why are the paths in $LOADED_FEATURES encoded as ASCII-8BIT? Shouldn't those paths be tagged with the filesystem encoding?

Updated by tenderlovemaking (Aaron Patterson) about 7 years ago

  • Assignee changed from tenderlovemaking (Aaron Patterson) to nobu (Nobuyoshi Nakada)

Nobu, do you know why the paths in $LOADED_FEATURES are encoded as ASCII-8BIT? Shouldn't they be tagged with the filesystem encoding? Thanks.
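
One way to check how the entries are tagged on a given Ruby build is sketched below. The helper name is illustrative, and the counts will differ between Ruby versions and platforms, so no particular output is assumed.

```ruby
require "yaml"

# Illustrative helper: count how many feature paths carry each encoding tag.
def feature_encoding_counts(features = $LOADED_FEATURES)
  features.each_with_object(Hash.new(0)) do |feature, counts|
    counts[feature.encoding] += 1
  end
end

feature_encoding_counts.each { |enc, n| puts "#{enc}: #{n}" }
puts "filesystem encoding: #{Encoding.find('filesystem')}"
```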


Updated by nobu (Nobuyoshi Nakada) about 7 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r36800.
Thomas, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

Updated by nobu (Nobuyoshi Nakada) about 7 years ago

  • Tracker changed from Bug to Backport
  • Project changed from Ruby master to Backport193
  • Category deleted (lib)
  • Status changed from Closed to Assigned
  • Assignee changed from nobu (Nobuyoshi Nakada) to usa (Usaku NAKAMURA)
  • Target version deleted (1.9.3)

Updated by usa (Usaku NAKAMURA) almost 7 years ago

  • Status changed from Assigned to Closed

This issue was solved with changeset r37209.
Thomas, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.