Bug #9766

Add force_encoding option to csv

Added by dtaniwaki (DAISUKE TANIWAKI) about 6 years ago. Updated over 5 years ago.

Status: Closed
Priority: Normal
Target version:
ruby -v: all
[ruby-core:62113]

Description

Hi there,

I ran into trouble using CSV.generate with the encoding option set to 'Shift-JIS'. After investigating for a long time, I found that it is caused by the compatibility between "UTF-8" and "Shift-JIS". Since "Shift-JIS" can be converted to UTF-8, adding a row of UTF-8 strings to the csv instance makes the encoding of all rows UTF-8.

Take a look at the relevant code:
https://github.com/dtaniwaki/ruby/blob/trunk/lib/csv.rb#L1658

Here's the code example.

irb(main):002:0> s = generate(encoding: 'SJIS') do |csv|
    csv << ['あ']
  end
=> ["あ"]

irb(main):003:0> s
=> "あ\n"

irb(main):004:0> s.encoding
=> #<Encoding:UTF-8>

I intended to generate an SJIS-encoded CSV, but the result was a UTF-8 CSV. I think everyone would expect this to generate a Shift-JIS encoded CSV string, so could you consider merging the change attached to this issue?
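The encoding switch described above can be reproduced with plain String objects. The following is only a minimal sketch of the general behavior when non-ASCII UTF-8 text is appended to an empty buffer labeled Windows-31J; it does not go through csv.rb, and the variable names are made up for illustration.

```ruby
# Illustration with plain String objects only; this is not the csv.rb code path.
buffer = String.new.force_encoding('Windows-31J') # empty buffer labeled as Shift_JIS (Windows-31J)
row    = "\u3042\n"                               # "あ" plus newline; \u escapes yield a UTF-8 string

before = buffer.encoding
buffer << row                                     # append non-ASCII UTF-8 text to the empty buffer
after  = buffer.encoding                          # the buffer now carries the appended string's encoding

puts "before: #{before}, after: #{after}"
```

Because the buffer is empty when the first row arrives, the append is treated as compatible and the buffer ends up labeled UTF-8 rather than raising an error.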

The expected result is here.

irb(main):002:0> s = generate(encoding: 'SJIS', force_encoding: true) do |csv|
    csv << ['あ']
  end
=> ["あ"]

irb(main):003:0> s
=> "\x{E381}\x82\n"

irb(main):004:0> s.encoding
=> #<Encoding:Windows-31J>
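When reading the output above, it may help to compare relabeling a string with transcoding it. This is a sketch using only core String methods (force_encoding and encode); it makes no claim about what the attached diff does internally.

```ruby
utf8 = "\u3042\n" # "あ" plus newline as a UTF-8 string

# force_encoding keeps the bytes and only changes the encoding label.
relabeled  = utf8.dup.force_encoding('Windows-31J')
# encode converts the characters into Windows-31J (Shift_JIS) bytes.
transcoded = utf8.encode('Windows-31J')

hex = ->(s) { s.bytes.map { |b| format('%02X', b) }.join(' ') }

puts hex.call(relabeled)          # bytes are unchanged from the UTF-8 original
puts hex.call(transcoded)         # bytes differ: the text was converted
puts relabeled.valid_encoding?
puts transcoded.valid_encoding?
```

The relabeled string keeps the UTF-8 bytes E3 81 82 under a Windows-31J label, which matches the "\x{E381}\x82" form shown above, while the transcoded string holds 82 A0, the Shift_JIS bytes for "あ".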

Files

csv.rb.diff (1.41 KB): Diff from revision bdeedccc5fb9131cff58cffd3428d30117bc0e74 in trunk. dtaniwaki (DAISUKE TANIWAKI), 04/21/2014 07:36 AM
0001-csv.rb-honor-encoding-option.patch (2.64 KB). nobu (Nobuyoshi Nakada), 04/22/2014 03:54 AM

Updated by dtaniwaki (DAISUKE TANIWAKI) about 6 years ago

Correction: the method used to reproduce this in the code blocks above should be CSV.generate.

I hope this can be merged into all versions of Ruby.

Updated by nobu (Nobuyoshi Nakada) about 6 years ago

I'd rather consider it a bug.

Updated by nobu (Nobuyoshi Nakada) almost 6 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

Applied in changeset r46391.


csv.rb: honor encoding option

  • lib/csv.rb (CSV#<<): honor explicitly given encoding. based on the patch by DAISUKE TANIWAKI at [ruby-core:62113]. [Bug #9766]

Updated by nagachika (Tomoyuki Chikanaga) over 5 years ago

  • Backport changed from 2.0.0: REQUIRED, 2.1: REQUIRED to 2.0.0: REQUIRED, 2.1: DONE

r46391 and r46395 were backported into ruby_2_1 branch at r47586.

Updated by usa (Usaku NAKAMURA) over 5 years ago

  • Backport changed from 2.0.0: REQUIRED, 2.1: DONE to 2.0.0: DONE, 2.1: DONE

backported into ruby_2_0_0 at r47608.
