Bug #11116
closed
The spec of String#dump
Description
The current spec says:
call-seq:
str.dump -> new_str
Produces a version of +str+ with all non-printing characters replaced by
<code>\nnn</code> notation and all special characters escaped.
"hello \n ''".dump #=> "\"hello \\n ''\"
\nnn must be \xnn now.
In addition, I have expected String#dump to return a string that, when evaled, evaluates to the original string (except for singleton methods, object id, etc.). Is this the right expectation? If so, it would be good to officially mention it in the spec. What do you think?
--
Yusuke Endoh mame@ruby-lang.org
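The round-trip expectation described above can be checked with a small sketch like the following. The helper name dump_round_trips? and the sample strings are hypothetical, chosen only for illustration; this is not an official test from the Ruby repository.

```ruby
# Hypothetical helper (not part of Ruby): returns true when evaluating
# str.dump reproduces both the content and the encoding of str.
def dump_round_trips?(str)
  restored = eval(str.dump)
  restored == str && restored.encoding == str.encoding
end

# Illustrative samples: escapes, quotes, a non-ASCII character, raw bytes.
samples = [
  "hello \n ''",
  "tab\tand \"quotes\"",
  "caf\u00E9",
  "\x00\xFF".b,
]

results = samples.map { |s| dump_round_trips?(s) }
p results
```

Since eval runs arbitrary code, a check like this should only be applied to strings you produced yourself. On Ruby 2.5 and later, String#undump offers an eval-free way to reverse String#dump.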
Updated by mame (Yusuke Endoh) over 4 years ago
- Status changed from Open to Closed
Committed at r66894. Closing.
Updated by Eregon (Benoit Daloze) over 4 years ago
Does that preserve the encoding of the String though?
What about String#inspect, does it also eval() to itself?
Updated by mame (Yusuke Endoh) over 4 years ago
Eregon (Benoit Daloze) wrote:
Does that preserve the encoding of the String though?
The short answer: yes.
If the encoding is ASCII-compatible, it is preserved through the encoding of the dumped string itself.
s = "Hello こんにちは".encode("Windows-31J")
s = s.dump
puts s #=> "Hello \x82\xB1\x82\xF1\x82\xC9\x82\xBF\x82\xCD"
p s.encoding #=> #<Encoding:Windows-31J>
s = eval(s)
p s.encoding #=> #<Encoding:Windows-31J>
puts s.encode("UTF-8") #=> Hello こんにちは
If the encoding is not ASCII-compatible, the dumped string has an explicit force_encoding call appended.
s = "Hello こんにちは".encode("UTF-16LE")
s = s.dump
puts s #=> "H\x00e\x00l\x00l\x00o\x00 \x00S0\x930k0a0o0".dup.force_encoding("UTF-16LE")
s = eval(s)
puts s.encode("UTF-8") #=> Hello こんにちは
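As a rough illustration of the two cases above, the following sketch checks whether the dumped string carries a force_encoding suffix, depending on whether the source encoding is ASCII-compatible. The variable names are assumptions made for this example, and the exact text of the suffix may differ between Ruby versions, so the check only looks for the method name.

```ruby
# Same text in an ASCII-compatible and a non-ASCII-compatible encoding.
sjis  = "Hello \u3053\u3093".encode("Windows-31J")
utf16 = "Hello \u3053\u3093".encode("UTF-16LE")

# Only the non-ASCII-compatible case should need the explicit suffix.
sjis_has_suffix  = sjis.dump.include?("force_encoding")
utf16_has_suffix = utf16.dump.include?("force_encoding")

[[sjis, sjis_has_suffix], [utf16, utf16_has_suffix]].each do |str, suffix|
  puts "#{str.encoding}: ascii_compatible=#{str.encoding.ascii_compatible?}, " \
       "force_encoding suffix=#{suffix}"
end

# In both cases, evaluating the dump restores the original encoding.
restored_sjis  = eval(sjis.dump)
restored_utf16 = eval(utf16.dump)
```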
I guess there may be other subtle edge cases that I am not aware of, so it is difficult for me to write detailed documentation.
What about String#inspect, does it also eval() to itself?
No, it is not guaranteed. String#inspect is intended only for human readers, so it has no annoying hack like the one above for non-ASCII-compatible encodings. (I have encountered another subtle case, but I cannot remember it...)
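The difference between the two methods can be seen with a non-ASCII-compatible encoding. This is a minimal sketch with illustrative variable names, and it shows only one case where inspect does not round-trip; it is not a complete list of the differences.

```ruby
utf16 = "Hello".encode("UTF-16LE")

# inspect produces a human-readable literal without any encoding hint,
# so evaluating it cannot restore UTF-16LE.
via_inspect = eval(utf16.inspect)

# dump appends an explicit force_encoding call, so the encoding survives.
via_dump = eval(utf16.dump)

inspect_keeps_encoding = via_inspect.encoding == utf16.encoding
dump_keeps_encoding    = via_dump.encoding == utf16.encoding

puts "inspect keeps encoding: #{inspect_keeps_encoding}"
puts "dump keeps encoding:    #{dump_keeps_encoding}"
```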