Bug #8784
Closed
CSV - Empty fields are discarded when col_sep is a space
Description
When a space is used as the column delimiter, empty fields are discarded.
With other delimiters, like comma, empty fields are correctly retrieved.
The following code reproduces the problem:
#!/usr/bin/env ruby
require 'csv'
print(CSV.parse('2009,2,3,8,0,,30.1,,'))
puts
print(CSV.parse('2009 2 3 8 0  30.1  ', col_sep: ' '))
puts
Expected output
[["2009", "2", "3", "8", "0", nil, "30.1", nil, nil]]
[["2009", "2", "3", "8", "0", nil, "30.1", nil, nil]]
Current output
[["2009", "2", "3", "8", "0", nil, "30.1", nil, nil]]
[["2009", "2", "3", "8", "0", "30.1", nil]]
Files
        
Updated by grim7reaper (Sylvain Laperche) over 11 years ago
- File bug_csv_col_sep.rb added
The bug is still present in Ruby 2.1.0:
% ruby -v
ruby 2.1.0p0 (2013-12-25 revision 44422) [x86_64-linux]
The attached script prints:
[["2009", "2", "3", "8", "0", nil, "30.1", nil, nil]]
[["2009", "2", "3", "8", "0", "30.1", nil]]
instead of:
[["2009", "2", "3", "8", "0", nil, "30.1", nil, nil]]
[["2009", "2", "3", "8", "0", nil, "30.1", nil, nil]]
        
Updated by hsbt (Hiroshi SHIBATA) about 11 years ago
- Status changed from Open to Assigned
- Assignee set to JEG2 (James Gray)
        
Updated by slavcho42 (Slavcho Ivanov) over 8 years ago
The problem comes from String#split, which csv.rb uses to separate the fields.
This is documented behavior of split rather than a bug in it: http://ruby-doc.org/core-2.4.0/String.html#method-i-split
"If pattern is a single space, str is split on whitespace, with leading whitespace and runs of contiguous whitespace characters ignored."
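The difference can be reproduced without the csv library. The following is a minimal sketch that uses only String#split on the line from the report; the variable names are illustrative.

```ruby
line = '2009 2 3 8 0  30.1  '

# A single-space string pattern triggers awk-style splitting:
# runs of whitespace collapse, so the empty fields disappear.
awk_style = line.split(' ', -1)

# A regexp that matches exactly one space keeps every empty field.
exact = line.split(/ /, -1)

p awk_style  # => ["2009", "2", "3", "8", "0", "30.1", ""]
p exact      # => ["2009", "2", "3", "8", "0", "", "30.1", "", ""]
```

The first result has the same shape as the "Current output" row, and the second has one entry per expected field.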
This should fix the problem in csv.rb, line 1834:
old:
parts = parse.split(@col_sep, -1)
new:
sep = @col_sep == ' ' ? / / : @col_sep
parts = parse.split(sep, -1)
Hope this helps.
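For anyone stuck on an affected version, the same idea can be applied outside csv.rb. This is only a sketch for simple, unquoted, single-line input. parse_space_separated is a hypothetical helper, not part of the csv library, and it does not handle quoting or escaping.

```ruby
# Hypothetical helper (not part of csv.rb): split one unquoted line on
# single spaces and turn empty fields into nil, as CSV does.
def parse_space_separated(line)
  line.chomp.split(/ /, -1).map { |field| field.empty? ? nil : field }
end

p parse_space_separated('2009 2 3 8 0  30.1  ')
# => ["2009", "2", "3", "8", "0", nil, "30.1", nil, nil]
```

The regexp / / is used instead of the string ' ' so that split does not fall back to its whitespace-collapsing mode.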
        
Updated by mame (Yusuke Endoh) over 7 years ago
- Assignee changed from JEG2 (James Gray) to kou (Kouhei Sutou)
- Backport deleted (1.9.3: UNKNOWN, 2.0.0: UNKNOWN)
@kou (Kouhei Sutou), could you check this ticket?
        
Updated by kou (Kouhei Sutou) over 7 years ago
- Status changed from Assigned to Closed
Thanks for your report.
I've fixed it in master: https://github.com/ruby/csv/commit/ba560e407a152afffea589d832084c249471eeb6