Bug #11116

closed

The spec of String#dump

Added by mame (Yusuke Endoh) about 7 years ago. Updated over 3 years ago.

Status:
Closed
Priority:
Normal
Target version:
-
ruby -v:
ruby 2.2.1p85 (2015-02-26 revision 49769) [x86_64-linux]
[ruby-core:69063]

Description

The current spec says:

 call-seq:
   str.dump   -> new_str

Produces a version of +str+ with all non-printing characters replaced by
<code>\nnn</code> notation and all special characters escaped.

  "hello \n ''".dump  #=> "\"hello \\n ''\"

The documented \nnn notation is outdated; it should now read \xnn.

In addition, I have always expected String#dump to return a string that, when evaled, yields the original string (apart from singleton methods, object id, etc.). Is this expectation correct? If so, it would be good to state it officially in the spec. What do you think?
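
The expectation above can be written down as a small check. This is only a sketch: the helper name dump_round_trips? and the SAMPLES list are made up for illustration, and a handful of passing samples is not evidence of a guarantee for every string.

```ruby
# Hypothetical helper (not part of Ruby): true when eval of the dumped
# form gives back an equal string with the same encoding.
# eval is only safe here because the input comes from String#dump.
def dump_round_trips?(str)
  restored = eval(str.dump)
  restored == str && restored.encoding == str.encoding
end

# Illustrative samples only; all of them are UTF-8 strings.
SAMPLES = [
  "hello \n ''",                   # the example from the current spec
  "tab\tand NUL\0",                # control characters
  'literal #{braces} and #$var',   # must not be interpolated on eval
  "\u3053\u3093\u306B\u3061\u306F" # non-ASCII characters
].freeze

SAMPLES.each do |sample|
  puts "#{sample.dump} round-trips: #{dump_round_trips?(sample)}"
end
```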

--
Yusuke Endoh

Updated by mame (Yusuke Endoh) over 3 years ago

  • Status changed from Open to Closed

Committed at r66894. Closing.

Updated by Eregon (Benoit Daloze) over 3 years ago

Does that preserve the encoding of the String though?

What about String#inspect, does it also eval() to itself?

Updated by mame (Yusuke Endoh) over 3 years ago

Eregon (Benoit Daloze) wrote:

Does that preserve the encoding of the String though?

The short answer: yes.

If the encoding is ASCII-compatible, it is preserved through the encoding of the dumped string itself.

s = "Hello こんにちは".encode("Windows-31J")
s = s.dump
puts s                 #=> "Hello \x82\xB1\x82\xF1\x82\xC9\x82\xBF\x82\xCD"
p s.encoding           #=> #<Encoding:Windows-31J>
s = eval(s)
p s.encoding           #=> #<Encoding:Windows-31J>
puts s.encode("UTF-8") #=> Hello こんにちは

If the encoding is not ASCII-compatible, the dumped string ends with an explicit force_encoding call.

s = "Hello こんにちは".encode("UTF-16LE")
s = s.dump
puts s                 #=> "H\x00e\x00l\x00l\x00o\x00 \x00S0\x930k0a0o0".dup.force_encoding("UTF-16LE")
s = eval(s)
puts s.encode("UTF-8") #=> Hello こんにちは
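
The two cases above can be checked side by side. This is a minimal sketch: the helper name dump_and_restore is invented here, and only the two encodings already used in this comment are exercised.

```ruby
# Hypothetical helper: returns the dumped text and the string obtained
# by evaluating it. eval is used only on the output of String#dump.
def dump_and_restore(str)
  dumped = str.dump
  [dumped, eval(dumped)]
end

text = "Hello \u3053\u3093\u306B\u3061\u306F"

["Windows-31J", "UTF-16LE"].each do |name|
  original = text.encode(name)
  dumped, restored = dump_and_restore(original)

  puts "#{name}:"
  puts "  explicit force_encoding: #{dumped.include?('.force_encoding(')}"
  puts "  same encoding:           #{restored.encoding == original.encoding}"
  puts "  equal to original:       #{restored == original}"
end
```

For the ASCII-compatible encoding the information travels in the encoding of the dumped string, so no force_encoding call appears in its text. For UTF-16LE the call is part of the dumped text.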

I suspect there are other subtle edge cases that I am not aware of, so it is difficult for me to write detailed documentation.

What about String#inspect, does it also eval() to itself?

No, it is not guaranteed. String#inspect is meant only for humans, so it has no annoying hack like the one above for non-ASCII-compatible encodings. (I have encountered another subtle case, but I cannot remember it...)
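
One case where the difference shows up is a non-ASCII-compatible encoding. The sketch below uses a made-up helper name, restored_encodings, and looks only at UTF-16LE; it is an illustration, not a list of every difference between the two methods.

```ruby
# Hypothetical helper: evaluates both textual forms of a string and
# reports the encoding each evaluation ends up with.
def restored_encodings(str)
  {
    original: str.encoding,
    via_dump: eval(str.dump).encoding,
    via_inspect: eval(str.inspect).encoding
  }
end

result = restored_encodings("Hello".encode("UTF-16LE"))

puts "dump keeps the encoding:    #{result[:via_dump] == result[:original]}"
puts "inspect keeps the encoding: #{result[:via_inspect] == result[:original]}"
```

The inspect output contains no force_encoding call, so the evaluated string simply takes the encoding of the text being evaluated.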
