Bug #15044

closed

ENV encoding not UTF-8 by default

Added by lowang (Przemyslaw Wroblewski) over 6 years ago. Updated about 5 years ago.

Status: Closed
Assignee: -
Target version: -
ruby -v: ruby 2.5.1p57 (2018-03-29 revision 63029) [x86_64-linux]
[ruby-core:88734]

Description

$ irb
2.5.1 :001 > 'secret'.encoding
 => #<Encoding:UTF-8>
2.5.1 :002 > ENV['PASS'] = 'secret'; ENV['PASS'].encoding
 => #<Encoding:US-ASCII>
2.5.1 :009 > ENV['PASS'] = 'Ł'
 => "\u0141"
2.5.1 :010 > ENV['PASS'].encoding
 => #<Encoding:ASCII-8BIT>

I would expect all encodings to be UTF-8 at all times

Updated by shevegen (Robert A. Heiler) over 6 years ago

If I put this into a .rb file:

puts 'secret'.encoding
ENV['PASS'] = 'secret'
puts ENV['PASS'].encoding

On my system I get these two Strings output:

UTF-8
ISO-8859-1

My environment, i.e. my current locale, is ISO-8859-1, so the results that
I get seem correct. I can change the default UTF-8 source encoding if I use a
shebang line in the .rb file, which I normally do, so all my encodings are
the same (ISO-8859-1; regexes used to behave a bit oddly sometimes, but I
am not sure whether that has changed).

I think ENV behaves a little bit differently upon
assignment.

If I use a shebang line in a .rb file that includes the above unicode
character (this weird L), then all string encodings in that .rb file
are also ISO-8859-1, so I am not sure if there is any bug at all.
It may be more related to IRB perhaps? I skipped testing on IRB mostly
because .rb files have a "higher weight" than code put through IRB.

The documentation does not mention what happens with encodings when
these are assigned to an ENV key, though:

https://ruby-doc.org/core-2.5.1/ENV.html

Perhaps it has more to do with IRB, in which case it could be added
there:

http://ruby-doc.org/stdlib-2.5.1/libdoc/irb/rdoc/IRB.html

And of course it may be that there is indeed a bug. You can try to
test with a standalone .rb file though and, if necessary, with a
specific shebang comment.

Updated by mame (Yusuke Endoh) about 5 years ago

  • Status changed from Open to Closed

It is intentional according to naruse. The encoding of ENV depends on the environment variable LANG.
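A minimal sketch of how the dependence on the locale could be checked: it starts a child Ruby process with a chosen locale and reports the encoding name that the child sees on an ENV value. The helper name env_encoding_under is hypothetical, and the result depends on which locales are installed on the system, so no particular output is claimed here.

```ruby
require "rbconfig"

# Hypothetical helper (not part of Ruby): run a child Ruby with the given
# locale and return the name of the encoding it reports for an ENV value.
def env_encoding_under(locale)
  # LC_ALL takes precedence over LANG, so set both to the same value.
  env = { "LC_ALL" => locale, "LANG" => locale }
  script = 'ENV["PASS"] = "secret"; print ENV["PASS"].encoding.name'
  IO.popen(env, [RbConfig.ruby, "-e", script], &:read)
end

puts env_encoding_under("C")
```

Calling the helper with a UTF-8 locale that exists on the machine (for example "en_US.UTF-8", if installed) should make the difference visible.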

Updated by naruse (Yui NARUSE) about 5 years ago

Values assigned to ENV are stored in the process's environment variables.

The encoding of ENV[key] is set to the locale encoding.
You can get the locale encoding with Encoding.find("locale"). It is derived from Encoding.locale_charmap, which in turn is affected by ENV["LANG"] and ENV["LC_ALL"].

Note that ENV["PATH"] is returned in the filesystem encoding, but on Unix that is the same as the locale encoding.
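The explanation above can be sketched in Ruby as follows. The first part prints the locale charmap and the encoding derived from it. The second part is a possible workaround, assuming the bytes stored in the environment are known to be UTF-8: the hypothetical helper env_as_utf8 (not part of Ruby) re-tags a copy of the value without changing its bytes. It also assumes Encoding.default_internal is not set, since Ruby would otherwise transcode the value on read.

```ruby
# Show which encoding Ruby derives from the current locale.
puts "locale charmap:  #{Encoding.locale_charmap}"
puts "locale encoding: #{Encoding.find('locale')}"

# Hypothetical helper: return ENV[key] tagged as UTF-8.
# Assumes the stored bytes really are UTF-8; force_encoding only
# changes the encoding label, it does not transcode.
def env_as_utf8(key)
  value = ENV[key]
  return nil if value.nil?
  # ENV values are frozen, so work on a copy.
  value.dup.force_encoding(Encoding::UTF_8)
end

ENV["PASS"] = "\u0141"
puts "raw encoding:    #{ENV['PASS'].encoding}"
puts "re-tagged:       #{env_as_utf8('PASS').encoding}"
```

If the environment may contain bytes that are not valid UTF-8, checking valid_encoding? on the result before using it is a reasonable precaution.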
