Feature #12839

CSV - Give not nil but empty strings for empty fields

Added by 5.5 (5 5) almost 2 years ago. Updated 3 months ago.

Status: Closed
Priority: Normal
Target version: -
[ruby-core:77623]

Description

The CSV parser gives nil for empty fields.

require "csv"

CSV.parse(%|,""|) #=> [[nil, ""]]

The above behavior may be suitable for some programmers, but I would like to get [["", ""]].

So, up to Ruby 2.1, I reluctantly wrote the following code:

require "csv"

CSV.parse(%|,""|, converters: lambda{|v| v || ""})
#=> [["", ""]]

It is wasteful, but certainly works for my purpose.

However, because of #11126, the above code no longer works as of Ruby 2.2
(converters are not called for nil).
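A workaround that does not depend on converter behavior is to replace nil fields after parsing. The following is only a minimal sketch: the helper name blank_nils is hypothetical and is not part of the csv library or of this ticket.

```ruby
require "csv"

# Hypothetical helper (not part of csv): replace nil fields with ""
# after parsing, so CSV converters are not involved at all.
def blank_nils(rows)
  rows.map { |row| row.map { |field| field.nil? ? "" : field } }
end

# First row: unquoted empty field, then quoted empty field.
# Second row: "foo", then an unquoted empty field.
rows = blank_nils(CSV.parse(",\"\"\nfoo,\n"))
p rows #=> [["", ""], ["foo", ""]]
```

This walks every row a second time, so it is at least as wasteful as the converter approach; it only avoids the dependency on how converters treat nil.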

I merely want an option that makes the CSV parser give empty strings for empty fields.

Namely,

require "csv"

CSV.parse(%|,""|, string: true) #=> [["", ""]]

Related issues

Related to Ruby trunk - Bug #11126: CSV field converters doesn't attempt to convert nil value. (Closed)

History

#1 Updated by shyouhei (Shyouhei Urabe) over 1 year ago

  • Related to Bug #11126: CSV field converters doesn't attempt to convert nil value. added

#2 [ruby-core:78781] Updated by shyouhei (Shyouhei Urabe) over 1 year ago

  • Assignee set to JEG2 (James Gray)

Currently there are no active developers for CSV.

I heard that the distinction between silent and sound empty cells is intentional. However, I have no idea whether the converter issue is a bug or by design.

#3 [ruby-core:85819] Updated by hsbt (Hiroshi SHIBATA) 5 months ago

  • Assignee changed from JEG2 (James Gray) to kou (Kouhei Sutou)
  • Status changed from Open to Assigned

#4 [ruby-core:86072] Updated by kou (Kouhei Sutou) 4 months ago

  • Status changed from Assigned to Closed

This code works again with the latest csv.

require "csv"

CSV.parse(%|,""|, converters: lambda{|v| v || ""})
#=> [["", ""]]

#5 [ruby-core:86473] Updated by 5.5 (5 5) 3 months ago

Thank you very much for fixing #11126.

But I think that the status of my ticket (#12839) should be Rejected rather than Closed,
because what I hope for is that the CSV parser itself gives empty strings for empty fields.

Using converters slows parsing down.

gem "benchmark-ips"
require "benchmark/ips"
require "csv"

csv_text = <<EOT
foo,bar,"",baz
hoge,"",temo,""
roo,goo,por,kosh
EOT

conv = ->(s){ s || "" }

Benchmark.ips 20 do |r|
  r.report "without converter" do
    CSV.parse csv_text
  end

  r.report "with converter" do
    CSV.parse csv_text, converters: conv
  end

  r.compare!
end

# Comparison:
#   without converter:     9968.4 i/s
#      with converter:     8590.4 i/s - 1.16x  slower

#6 [ruby-core:86484] Updated by kou (Kouhei Sutou) 3 months ago

I added the :nil_value option as a shortcut:

require "csv"

p CSV.parse(',"",a')                # => [[nil, "", "a"]]
p CSV.parse(',"",a', nil_value: "") # => [["", "", "a"]]
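To reuse the option across several call sites, it can be wrapped in a small helper. This is a sketch under assumptions: parse_blank_as_empty is a hypothetical name, and it requires a csv release that includes the :nil_value option described above.

```ruby
require "csv"

# Hypothetical helper (not part of csv): parse text so that unquoted
# empty fields come back as "" instead of nil. Any other CSV options
# are passed through unchanged.
def parse_blank_as_empty(text, **options)
  CSV.parse(text, nil_value: "", **options)
end

rows = parse_blank_as_empty("foo,,baz\n,\"\",x\n")
p rows #=> [["foo", "", "baz"], ["", "", "x"]]

# The same option is accepted when parsing a single line.
line = CSV.parse_line("a,,b", nil_value: "")
p line #=> ["a", "", "b"]
```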

But it's not so fast:

require "csv"

require "benchmark/ips"

csv_text = <<CSV
foo,bar,,baz
hoge,,temo,
roo,goo,por,kosh
CSV

convert_nil = ->(s) {s || ""}

Benchmark.ips do |r|
  r.report "not convert" do
    CSV.parse(csv_text)
  end

  r.report "converter" do
    CSV.parse(csv_text, converters: convert_nil)
  end

  r.report "option" do
    CSV.parse(csv_text, nil_value: "")
  end

  r.compare!
end
Warming up --------------------------------------
         not convert   742.000  i/100ms
           converter   620.000  i/100ms
              option   672.000  i/100ms
Calculating -------------------------------------
         not convert      7.480k (± 1.8%) i/s -     37.842k in   5.061095s
           converter      6.289k (± 0.5%) i/s -     31.620k in   5.028042s
              option      6.697k (± 3.7%) i/s -     33.600k in   5.025273s

Comparison:
         not convert:     7479.8 i/s
              option:     6696.8 i/s - 1.12x  slower
           converter:     6288.9 i/s - 1.19x  slower
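Before reading too much into the timings, it may be worth checking that the benchmarked variants actually return the same rows. The sketch below reuses the benchmark input from above; it assumes a csv release that has both the :nil_value option and the fix for #11126, and it makes no claim about speed.

```ruby
require "csv"

csv_text = <<~CSV
  foo,bar,,baz
  hoge,,temo,
  roo,goo,por,kosh
CSV

convert_nil = ->(s) { s || "" }

plain     = CSV.parse(csv_text)                          # nil for empty fields
converted = CSV.parse(csv_text, converters: convert_nil) # relies on the #11126 fix
optioned  = CSV.parse(csv_text, nil_value: "")           # relies on :nil_value

# The input has three unquoted empty fields.
p plain.flatten.count(nil) #=> 3

# Expected to print true on a csv release containing both changes.
p converted == optioned
```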
