Bug #4069: String#parse_csv fails to parse "\r" character embedded string - Ruby - Ruby Issue Tracking System

Bug #4069

closed

String#parse_csv fails to parse "\r" character embedded string

Added by phasis68 (Heesob Park) over 14 years ago. Updated over 14 years ago.

Status:

Rejected

Assignee:

JEG2 (James Gray)

Target version:

2.0.0

ruby -v:

ruby 1.9.3dev (2010-11-18 trunk 29823) [i386-mswin32_90]

Backport:

[ruby-core:33247]

Description

=begin
C:\work>ruby -rcsv -ve 'p ["aa\rbb"].to_csv.parse_csv'
ruby 1.9.3dev (2010-11-18 trunk 29823) [i386-mswin32_90]
c:/usr/lib/ruby/1.9.1/csv.rb:1914:in `block in shift': Unclosed quoted field on line 1. (CSV::MalformedCSVError)
        from c:/usr/lib/ruby/1.9.1/csv.rb:1831:in `loop'
        from c:/usr/lib/ruby/1.9.1/csv.rb:1831:in `shift'
        from c:/usr/lib/ruby/1.9.1/csv.rb:1390:in `parse_line'
        from c:/usr/lib/ruby/1.9.1/csv.rb:2341:in `parse_csv'
        from -e:1:in `<main>'
=end

Updated by ender672 (Timothy Elliott) over 14 years ago

=begin
["aa\rbb"].to_csv results in the string "\"aa\rbb\"\n".

When you don't specify a row separator, Ruby's CSV library guesses one by searching for the first occurrence of \r or \n.

In the case of "\"aa\rbb\"\n" it encounters the \r first and assumes that it is your row separator. To point it to the correct row separator, supply the option :row_sep => "\n":

$ ruby -rcsv -ve 'p ["aa\rbb"].to_csv.parse_csv(:row_sep => "\n")'
ruby 1.9.3dev (2010-11-19 trunk 29830) [x86_64-linux]
["aa\rbb"]

=end
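
The workaround above can be checked with a small round trip. This is a minimal sketch that uses only the standard csv library; the variable names are illustrative, the keyword form `row_sep: "\n"` is the modern spelling of the `:row_sep => "\n"` option shown in the comment, and it assumes the output record separator is left at its default.

```ruby
require "csv"

# A single field with an embedded carriage return.
row = ["aa\rbb"]

# to_csv quotes the field and terminates the row with "\n"
# (assuming the default record separator has not been changed).
line = row.to_csv

# Passing row_sep explicitly skips auto-discovery, so the "\r"
# inside the quoted field is treated as ordinary field data.
parsed = line.parse_csv(row_sep: "\n")

p line
p parsed
```

With the separator given explicitly, the parsed row should equal the original array.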

Updated by JEG2 (James Gray) over 14 years ago

Status changed from Open to Rejected
Assignee set to JEG2 (James Gray)

=begin
Sorry, not sure how I missed this ticket. As Timothy says, this is intended, documented behavior:

 # <b><tt>:row_sep</tt></b>::            The String appended to the end of each
 #                                       row.  This can be set to the special
 #                                       <tt>:auto</tt> setting, which requests
 #                                       that CSV automatically discover this
 #                                       from the data.  Auto-discovery reads
 #                                       ahead in the data looking for the next
 #                                       <tt>"\r\n"</tt>, <tt>"\n"</tt>, or
 #                                       <tt>"\r"</tt> sequence.  A sequence
 #                                       will be selected even if it occurs in
 #                                       a quoted field, assuming that you
 #                                       would have the same line endings
 #                                       there.  If none of those sequences is
 #                                       found, +data+ is <tt>ARGF</tt>,
 #                                       <tt>STDIN</tt>, <tt>STDOUT</tt>, or
 #                                       <tt>STDERR</tt>, or the stream is only
 #                                       available for output, the default
 #                                       <tt>$INPUT_RECORD_SEPARATOR</tt>
 #                                       (<tt>$/</tt>) is used.  Obviously,
 #                                       discovery takes a little time.  Set
 #                                       manually if speed is important.  Also
 #                                       note that IO objects should be opened
 #                                       in binary mode on Windows if this
 #                                       feature will be used as the
 #                                       line-ending translation can cause
 #                                       problems with resetting the document
 #                                       position to where it was before the
 #                                       read ahead. This String will be
 #                                       transcoded into the data's Encoding
 #                                       before parsing.

=end
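
The discovery rule quoted above can be illustrated with a toy helper. This is a sketch of the documented heuristic only, not the library's actual implementation: `guess_row_sep` is a hypothetical name, it looks at the whole string rather than reading ahead in chunks, and it ignores the IO-specific cases the documentation mentions.

```ruby
# Hypothetical helper, not part of the csv library: returns the first
# "\r\n", "\n", or "\r" found in +data+, or $/ when none is present.
# Quoting is deliberately ignored, mirroring the documented behavior.
def guess_row_sep(data)
  # "\r\n" is listed first so a CRLF pair is not split into a lone "\r".
  data[/\r\n|\n|\r/] || $/
end

p guess_row_sep("a,b\r\nc,d\r\n")  # => "\r\n"
p guess_row_sep("\"aa\rbb\"\n")    # => "\r"
p guess_row_sep("a,b")
```

The second call shows why the input from this ticket trips auto-discovery: the "\r" inside the quoted field comes before the real "\n" terminator, so it is picked as the row separator.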
