Bug #13949
String#unpack with 'M' directive can create strings with wrong code range
Status:
Closed
Assignee:
-
Target version:
-
ruby -v:
ruby 2.4.2p198 (2017-09-14 revision 59899) [x86_64-linux]
Backport:
Description
I've noticed that String#unpack with the 'M' directive can create strings that should be CR_7BIT as CR_VALID. The issue appears to have been introduced in r30542, which assumes that all ASCII-8BIT strings must be CR_VALID. It's possible this was correct back during Ruby 1.9.3 development and just wasn't updated. I'm not familiar enough with the history to tell.
A simple reproduction showing the issue is:
res = '0123456789=\n'.unpack('M').first
p res
p res.encoding
p res.bytes
p res.ascii_only?
puts
packed = res.bytes.pack('c*')
p packed
p packed.encoding
p packed.bytes
p packed.ascii_only?
This yields the following output:
"0123456789=\\n"
#<Encoding:ASCII-8BIT>
[48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 61, 92, 110]
false
"0123456789=\\n"
#<Encoding:ASCII-8BIT>
[48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 61, 92, 110]
true
Both strings have exactly the same contents and the same encoding. But, depending on how you construct them, one is considered to be CR_7BIT (as indicated by the String#ascii_only? output) and the other is considered to be CR_VALID. I believe CR_7BIT is the correct code range value in this situation.
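To make the mismatch easier to check without relying on the cached code range, here is a minimal sketch that compares the two strings byte by byte. The helper seven_bit_bytes? is a hypothetical name introduced only for this illustration; it inspects the bytes directly instead of asking the string for its code range.

```ruby
# Hypothetical helper (not part of Ruby): decide 7-bit-ness from the
# bytes alone, ignoring whatever code range is cached on the string.
def seven_bit_bytes?(str)
  str.bytes.all? { |b| b < 0x80 }
end

source   = '0123456789=\n'             # single quotes: literal backslash + n
unpacked = source.unpack('M').first     # string built by the 'M' directive
repacked = unpacked.bytes.pack('c*')    # same bytes, built via Array#pack

# Both strings carry identical bytes, and every byte is below 0x80.
puts unpacked.bytes == repacked.bytes
puts seven_bit_bytes?(unpacked)
puts seven_bit_bytes?(repacked)

# The cached code range is what differs on an affected build:
# repacked.ascii_only? is true, while unpacked.ascii_only? was false
# on the ruby 2.4.2 build reported above.
puts repacked.ascii_only?
puts unpacked.ascii_only?
```

If the byte-level check and String#ascii_only? disagree for the unpacked string, that build shows the behavior described in this report.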