Bug #14127

closed

(CSV) generating UTF-16LE encoded file without BOM

Added by laykou (Ladislav Gallay) over 6 years ago. Updated over 5 years ago.

Status:
Rejected
Target version:
-
ruby -v:
2.4.1
[ruby-core:83865]

Description

The generated file should start with a BOM so that it is properly detected as a UTF-16LE file.

How to generate such a file:

file = CSV.generate(encoding: 'UTF-16LE') do |csv|
    csv << ['something', 'ľščťžýáíé']
end

According to file -I file.csv this file is recognized as application/octet-stream; charset=binary because it is missing the BOM information.

According to Wikipedia https://en.wikipedia.org/wiki/UTF-16 it should contain "\xFF\xFE" at the beginning of the document so that everyone knows it's UTF-16LE.

Here is someone fixing this in a similar way: https://stackoverflow.com/a/22950912/1632815 I did the same thing: manually adding the BOM.
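To make the BOM check concrete, here is a minimal sketch that inspects the first two bytes of some data, roughly what a detection tool looks for. The constant and method names (UTF16LE_BOM, utf16le_bom?) are made up for this illustration and are not part of Ruby or the CSV library.

```ruby
# Hypothetical helper, for illustration only.
# The UTF-16LE BOM is U+FEFF serialized little-endian: bytes FF FE.
UTF16LE_BOM = "\xFF\xFE".b.freeze

# Returns true when the data starts with the bytes FF FE.
# This is only a heuristic: a UTF-32LE BOM (FF FE 00 00) begins
# with the same two bytes.
def utf16le_bom?(data)
  data.b.start_with?(UTF16LE_BOM)
end

without_bom = 'something'.encode('UTF-16LE')
with_bom    = "\uFEFFsomething".encode('UTF-16LE')

puts utf16le_bom?(without_bom) # prints false
puts utf16le_bom?(with_bom)    # prints true
```

Comparing raw bytes (String#b) avoids any dependence on the encoding label attached to the string.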

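# Adds the BOM manually, albeit in a somewhat hacky way.
# 'foo.txt' and some_text are placeholders. The file is opened in binary mode
# ("wb") so the UTF-16LE bytes are written as-is instead of being transcoded.
File.open('foo.txt', 'wb') do |new_html_file|
  new_html_file << "\xFF\xFE".force_encoding('utf-16le') + some_text.force_encoding('utf-8').encode('utf-16le')
end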
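For the CSV case itself, the manual approach might be sketched as follows: generate the CSV as UTF-8, transcode it to UTF-16LE, and prepend the BOM before writing in binary mode. The method name write_utf16le_csv is hypothetical and is not something the CSV library provides; this is one possible workaround, not an official fix.

```ruby
require 'csv'

# Hypothetical helper, for illustration only.
# rows is an array of arrays, one inner array per CSV row.
def write_utf16le_csv(path, rows)
  # Build the CSV as UTF-8 first (explicit to keep the sketch predictable).
  utf8_csv = CSV.generate(encoding: 'UTF-8') do |csv|
    rows.each { |row| csv << row }
  end

  # U+FEFF encodes to the bytes FF FE in UTF-16LE.
  bom = "\uFEFF".encode('UTF-16LE')

  # binwrite stores the bytes unchanged, with no further transcoding.
  File.binwrite(path, bom + utf8_csv.encode('UTF-16LE'))
end
```

Calling write_utf16le_csv('file.csv', [['something', 'ľščťžýáíé']]) would then produce a file whose first two bytes are FF FE.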