Bug #8405

CSV module - improper regexp for escaping special characters

Added by David Unric 11 months ago. Updated 24 days ago.

[ruby-core:54986]
Status:Closed
Priority:Normal
Assignee:James Gray
Category:lib
Target version:current: 2.2.0
ruby -v:2.0.0p0
Backport:1.9.3: REQUIRED, 2.0.0: DONE, 2.1: REQUIRED

Description

=begin
There seems to be a bug in the csv.rb module. If you use a special character such as (({|})) as the quote_char (passed as a parameter to CSV methods like read), the program terminates with a (({CSV::MalformedCSVError: Missing or stray quote in line xxx})) error even if the input .csv file is correct.
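A minimal reproduction sketch of the call described above. The sample line and the helper name parse_with_pipe_quotes are made up for illustration. According to this report an affected Ruby raises CSV::MalformedCSVError here, while a Ruby that includes the fix should return the parsed fields.

```ruby
require "csv"

# Hypothetical helper: parse one line that uses "|" as the quote character.
# Returns the parsed fields, or the error text if the parser rejects the line.
def parse_with_pipe_quotes(line)
  CSV.parse_line(line, quote_char: "|")
rescue CSV::MalformedCSVError => e
  "#{e.class}: #{e.message}"
end

p parse_with_pipe_quotes("|a|,|b c|")
```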

Below is the assignment of the Regexp used for escaping characters that have special meaning in regular expressions:

1587: @re_chars = /#{%"[-][\.$?*+{}()|# \r\n\t\f\v]".encode(@encoding)}/

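The issue is with the leading (({[-]})), which I find completely wrong: it is a complete character class of its own, so the pattern only matches a hyphen followed by one of the listed characters and misses all the matches it was intended to make. The hyphen character "(({-}))" has to be escaped only inside brackets (({[]})), and only if it does not immediately follow the left bracket.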
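The effect can be checked in isolation. The sketch below uses the pattern as quoted above (written as a plain Regexp literal, without the encoding step) and compares it with a single character class. The helper escape_with and the SINGLE_CLASS pattern are illustrative assumptions, not the code in csv.rb or the change that was eventually committed.

```ruby
# The pattern as quoted in this report: "[-]" followed by a second class.
BROKEN = /[-][\.$?*+{}()|# \r\n\t\f\v]/

# Assumed alternative for comparison: one character class that also
# lists the brackets and the backslash.
SINGLE_CLASS = /[-\]\[\\.$?*+{}()|# \r\n\t\f\v]/

# Hypothetical helper: put a backslash in front of every match.
def escape_with(pattern, str)
  str.gsub(pattern) { |match| "\\" + match }
end

p escape_with(BROKEN, "|")        # prints "|"     (no match, left unescaped)
p escape_with(BROKEN, "-|")       # prints "\\-|"  (needs a leading hyphen)
p escape_with(SINGLE_CLASS, "|")  # prints "\\|"   (escaped as intended)
```

With BROKEN, a lone (({|})) is never escaped, so it would reach a later regular expression as an alternation operator rather than a literal character.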

A quick fix for the above issue might look like:

1587: @re_chars = /#{%"(?<!\[)-(?=.\])|[\.$?+{}()|# \r\n\t\f\v]".encode(@encoding)}/

Note that it would also match strings containing a right bracket without its left counterpart, but that doesn't matter here. Lookbehind doesn't support quantifiers in Ruby, so handling that case properly would require rewriting the whole substitution code where the pattern is applied.
=end

Associated revisions

Revision 45374
Added by James Gray about 1 month ago

  • lib/csv.rb: Fixed a broken regular expression that was causing CSV to miss escaping some special meaning characters when used in parsing. Reported by David Unric [Bug #8405]

Revision 45476
Added by Tomoyuki Chikanaga 24 days ago

merge revision(s) r45374: [Backport #8405]

* lib/csv.rb: Fixed a broken regular expression that was causing
  CSV to miss escaping some special meaning characters when used
  in parsing.
  Reported by David Unric
   [Bug #8405]

History

#1 Updated by Nobuyoshi Nakada 11 months ago

  • Description updated (diff)
  • Status changed from Open to Assigned
  • Assignee set to James Gray

#2 Updated by Yui NARUSE 11 months ago

  • Target version set to 2.1.0

#3 Updated by Hiroshi SHIBATA 3 months ago

  • Target version changed from 2.1.0 to current: 2.2.0

#4 Updated by James Gray about 1 month ago

  • % Done changed from 0 to 100
  • Status changed from Assigned to Closed

Applied in changeset r45374.


  • lib/csv.rb: Fixed a broken regular expression that was causing CSV to miss escaping some special meaning characters when used in parsing. Reported by David Unric [Bug #8405]

#5 Updated by Tomoyuki Chikanaga about 1 month ago

  • Backport changed from 1.9.3: UNKNOWN, 2.0.0: UNKNOWN to 1.9.3: REQUIRED, 2.0.0: REQUIRED, 2.1: REQUIRED

#6 Updated by Tomoyuki Chikanaga 24 days ago

  • Backport changed from 1.9.3: REQUIRED, 2.0.0: REQUIRED, 2.1: REQUIRED to 1.9.3: REQUIRED, 2.0.0: DONE, 2.1: REQUIRED

r45374 was backported to ruby20_0 at r45476.
