Feature #12839
Closed
CSV - Give not nil but empty strings for empty fields
Description
The CSV parser gives nil for empty fields.
require "csv"
CSV.parse(%|,""|) #=> [[nil, ""]]
The above behavior may be suitable for certain programmers, but I hope to get [["", ""]].
So, up to Ruby 2.1, I reluctantly wrote the following code:
require "csv"
CSV.parse(%|,""|, converters: lambda{|v| v || ""})
#=> [["", ""]]
It is wasteful, but certainly works for my purpose.
However, because of #11126, the above code no longer works as of Ruby 2.2
(converters are not called for nil).
I merely want an option that makes the CSV parser give empty strings for empty fields.
Namely,
require "csv"
CSV.parse(%|,""|, string: true) #=> [["", ""]]
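Until such an option exists, one workaround that does not depend on how converters treat nil is to post-process the parsed rows. The following is only a minimal sketch: `parse_with_blank_strings` is a hypothetical helper name, not part of the csv library, and it assumes the `headers:` option is not used (so `CSV.parse` returns an array of arrays).

```ruby
require "csv"

# Hypothetical helper (not part of the csv library): parse first,
# then replace every nil field with an empty string.
# Assumes plain array-of-arrays output, i.e. no headers: option.
def parse_with_blank_strings(text)
  CSV.parse(text).map do |row|
    row.map { |field| field.nil? ? "" : field }
  end
end

p parse_with_blank_strings(%|,""|) #=> [["", ""]]
```

This walks every row a second time, so it is no cheaper than a converter; it only avoids relying on the converter behavior discussed in #11126.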
Updated by shyouhei (Shyouhei Urabe) almost 8 years ago
- Related to Bug #11126: CSV field converters doesn't attempt to convert nil value. added
Updated by shyouhei (Shyouhei Urabe) almost 8 years ago
- Assignee set to JEG2 (James Gray)
There are currently no active developers for CSV.
I have heard that the distinction between silent (unquoted) and sound (quoted) empty cells is intentional. However, I have no idea whether the converter issue is a bug or by design.
Updated by hsbt (Hiroshi SHIBATA) almost 7 years ago
- Status changed from Open to Assigned
- Assignee changed from JEG2 (James Gray) to kou (Kouhei Sutou)
Updated by kou (Kouhei Sutou) over 6 years ago
- Status changed from Assigned to Closed
This code works again with the latest csv.
require "csv"
CSV.parse(%|,""|, converters: lambda{|v| v || ""})
#=> [["", ""]]
Updated by 5.5 (5 5) over 6 years ago
Thank you very much for fixing #11126.
But I think that the status of my ticket (#12839) should be Rejected rather than Closed,
because what I hope for is that the CSV parser itself gives empty strings for empty fields.
Using converters slows parsing down.
gem "benchmark-ips"
require "benchmark/ips"
require "csv"
csv_text = <<EOT
foo,bar,"",baz
hoge,"",temo,""
roo,goo,por,kosh
EOT
conv = ->(s){ s || "" }
Benchmark.ips 20 do |r|
  r.report "without converter" do
    CSV.parse csv_text
  end
  r.report "with converter" do
    CSV.parse csv_text, converters: conv
  end
  r.compare!
end
# Comparison:
# without converter: 9968.4 i/s
# with converter: 8590.4 i/s - 1.16x slower
Updated by kou (Kouhei Sutou) over 6 years ago
I added the :nil_value option as a shortcut:
require "csv"
p CSV.parse(',"",a') # => [[nil, "", "a"]]
p CSV.parse(',"",a', nil_value: "") # => [["", "", "a"]]
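As a further illustration, the same option can presumably be combined with `headers: true`, since it is applied to data fields as they are parsed. This is a small sketch under that assumption; the sample data and the helper name `rows_as_hashes` are made up for the example.

```ruby
require "csv"

# Made-up sample: one unquoted empty field and one quoted empty field.
SAMPLE_WITH_HEADERS = "name,note\nfoo,\nbar,\"\"\n"

# Hypothetical helper: parse with headers and return plain hashes,
# substituting "" for fields that would otherwise be nil.
def rows_as_hashes(text)
  CSV.parse(text, headers: true, nil_value: "").map(&:to_h)
end

p rows_as_hashes(SAMPLE_WITH_HEADERS)
# => [{"name"=>"foo", "note"=>""}, {"name"=>"bar", "note"=>""}]
```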
But it's not so fast:
require "csv"
require "benchmark/ips"
csv_text = <<CSV
foo,bar,,baz
hoge,,temo,
roo,goo,por,kosh
CSV
convert_nil = ->(s) {s || ""}
Benchmark.ips do |r|
  r.report "not convert" do
    CSV.parse(csv_text)
  end
  r.report "converter" do
    CSV.parse(csv_text, converters: convert_nil)
  end
  r.report "option" do
    CSV.parse(csv_text, nil_value: "")
  end
  r.compare!
end
Warming up --------------------------------------
not convert 742.000 i/100ms
converter 620.000 i/100ms
option 672.000 i/100ms
Calculating -------------------------------------
not convert 7.480k (± 1.8%) i/s - 37.842k in 5.061095s
converter 6.289k (± 0.5%) i/s - 31.620k in 5.028042s
option 6.697k (± 3.7%) i/s - 33.600k in 5.025273s
Comparison:
not convert: 7479.8 i/s
option: 6696.8 i/s - 1.12x slower
converter: 6288.9 i/s - 1.19x slower
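Before reading too much into these numbers, it may help to confirm that the "converter" and "option" variants actually produce the same rows for the benchmark input. A quick check could look like the following; `nil_free_variants` is a hypothetical helper name, and the sketch assumes a csv release in which converters are called for nil fields (that is, one with #11126 fixed).

```ruby
require "csv"

# Same input as the benchmark above.
BENCH_TEXT = <<~CSV
  foo,bar,,baz
  hoge,,temo,
  roo,goo,por,kosh
CSV

# Hypothetical helper: parse the text with both nil-replacing approaches
# so that the results can be compared side by side.
def nil_free_variants(text)
  to_blank = ->(s) { s || "" }
  {
    converter: CSV.parse(text, converters: to_blank),
    option: CSV.parse(text, nil_value: ""),
  }
end

variants = nil_free_variants(BENCH_TEXT)
p variants[:converter] == variants[:option]
```

If the comparison prints false, the installed csv is probably one that still skips converters for nil, and the "converter" timing is then not measuring the same work.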