Bug #10584
String.valid_encoding?, String.ascii_only? fail to account for BOM.
Status: Rejected
Assignee: -
Target version: -
ruby -v: ruby 2.2.0preview2 (2014-11-28 trunk 48628) [x86_64-darwin14]
Backport:
Tags:
Description
IMO:
- A Unicode (UTF-16, UTF-32) string with a valid BOM should not be considered validly encoded once it is reinterpreted with the opposite endianness.
- A UTF-8 string with a BOM should not treat the BOM as a codepoint.
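The first point can be illustrated with a small sketch. The helpers `bom_encoding` and `bom_consistent?` below are hypothetical names, not part of Ruby; they only compare the leading bytes against the well-known BOM byte sequences and check them against the string's declared encoding. This is one possible reading of the proposal, not how `valid_encoding?` is implemented.

```ruby
# Hypothetical lookup table: BOM byte sequence => encoding it announces.
# The UTF-32 entries come first because the UTF-32LE BOM (FF FE 00 00)
# begins with the UTF-16LE BOM (FF FE).
BOMS = {
  "\x00\x00\xFE\xFF".b => Encoding::UTF_32BE,
  "\xFF\xFE\x00\x00".b => Encoding::UTF_32LE,
  "\xEF\xBB\xBF".b     => Encoding::UTF_8,
  "\xFE\xFF".b         => Encoding::UTF_16BE,
  "\xFF\xFE".b         => Encoding::UTF_16LE,
}.freeze

# Returns the encoding announced by a leading BOM, or nil if there is none.
def bom_encoding(str)
  raw = str.b # binary copy, so the comparison is purely byte-wise
  BOMS.each { |bom, enc| return enc if raw.start_with?(bom) }
  nil
end

# True when the string has no BOM, or its BOM agrees with its encoding.
def bom_consistent?(str)
  announced = bom_encoding(str)
  announced.nil? || announced == str.encoding
end

le      = "\uFEFFhi".encode("UTF-16LE")      # bytes: FF FE 68 00 69 00
swapped = le.dup.force_encoding("UTF-16BE")  # same bytes, wrong endianness

p swapped.valid_encoding?   # true, as reported in this issue
p bom_consistent?(le)       # true
p bom_consistent?(swapped)  # false
```

Note that the byte sequence FF FE 00 00 is ambiguous (a UTF-32LE BOM, or a UTF-16LE BOM followed by U+0000); the sketch simply prefers the longer match.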
> file utf-16be-file
utf-16be-file: POSIX shell script, Big-endian UTF-16 Unicode text executable
> file utf-16le-file
utf-16le-file: POSIX shell script, Little-endian UTF-16 Unicode text executable
> file utf-8-with-bom-file
utf-8-with-bom-file: POSIX shell script, UTF-8 Unicode (with BOM) text executable
> ruby -e "p File.binread('utf-16le-file').force_encoding('UTF-16BE').valid_encoding?"
true # expected: false
> ruby -e "p File.binread('utf-16be-file').force_encoding('UTF-16LE').valid_encoding?"
true # expected: false
> ruby -e "p File.read('utf-8-with-bom-file').ascii_only?"
false # expected: true
> ruby -e "p File.read('utf-8-with-bom-file')[0]"
"" # expected: '#'
No?
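For the UTF-8 case, the expected results can be obtained today if the caller opts in: Ruby's open modes accept a `BOM|` prefix on the external encoding, which consumes a leading BOM while reading. The sketch below assumes a temporary file standing in for `utf-8-with-bom-file` (its contents are made up for illustration) and compares a plain read with a BOM-aware one.

```ruby
require "tempfile"

plain = stripped = nil

# Stand-in for utf-8-with-bom-file: a UTF-8 BOM followed by a shebang line.
Tempfile.create("utf-8-with-bom-file") do |f|
  f.binmode
  f.write("\xEF\xBB\xBF#!/bin/sh\n".b)
  f.flush

  plain    = File.open(f.path, "r:UTF-8", &:read)      # BOM kept as U+FEFF
  stripped = File.open(f.path, "r:BOM|UTF-8", &:read)  # BOM consumed on open
end

p plain.ascii_only?     # false
p stripped.ascii_only?  # true
p stripped[0]           # "#"
```

With the plain read, the BOM stays in the string as the codepoint U+FEFF, which is why `ascii_only?` is false and `[0]` is not `#`.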