I am finding the CSV parsing in Ruby 1.9.3 to be remarkably fragile, so much so that I am wondering if I am doing something wrong.
If I do the following in irb, I get an error:
1.9.3-p125 :011 > require 'csv'
=> true
1.9.3-p125 :012 > a = 'one,two,three, "four, five",six'
=> "one,two,three, \"four, five\",six"
1.9.3-p125 :013 > arr = CSV.parse(a)
CSV::MalformedCSVError: Illegal quoting in line 1.
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1887:in `each'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1849:in `loop'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1849:in `shift'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1791:in `each'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1805:in `to_a'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1805:in `read'
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/lib/ruby/1.9.1/csv.rb:1379:in `parse'
from (irb):13
from /Users/disaacs/.rvm/rubies/ruby-1.9.3-p125/bin/irb:16:in `<main>'
I’ve found that the problem is the extra space preceding the “four, five” value. If I remove the space, it works:
1.9.3-p125 :010 > a = 'one,two,three,"four, five",six'
=> "one,two,three,\"four, five\",six"
1.9.3-p125 :011 > arr = CSV.parse(a)
=> [["one", "two", "three", "four, five", "six"]]
Spaces in front of the other values do not cause a problem. The following parses just fine:
one, two, three,"four, five", six
Is there some parse option I am missing that would make parsing quoted values less fragile?
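This is correct behavior; the parser is not being fragile.
The comma after “three” ends that field, so the next field starts immediately with the space. That makes it an unquoted field.
You can’t validly put a quote in the middle of a field (without escaping it). Quotes only delimit a field when the opening quote comes directly after the separator.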
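If you cannot fix the input at its source, one possible workaround is to remove the stray whitespace before an opening quote and then parse as usual. The following is only a minimal sketch: `strip_space_before_quotes` is a hypothetical helper, not part of the CSV library, and its regular expression is naive (it would also rewrite a comma, space, quote sequence that appears inside a quoted value). The last line shows the escaping rule: inside a quoted field, a literal quote is written by doubling it.

```ruby
require 'csv'

line = 'one,two,three, "four, five",six'

# The space after the comma makes the fourth field an unquoted field,
# so the quote that follows is rejected.
begin
  CSV.parse(line)
rescue CSV::MalformedCSVError => e
  warn "parse failed: #{e.message}"
end

# Hypothetical helper (an assumption, not a CSV library method):
# drop spaces or tabs that sit between a comma and an opening quote.
def strip_space_before_quotes(text)
  text.gsub(/,[ \t]+"/, ',"')
end

rows = CSV.parse(strip_space_before_quotes(line))
# rows is [["one", "two", "three", "four, five", "six"]]

# A literal quote inside a quoted field is escaped by doubling it.
escaped = CSV.parse('a,"say ""hi""",b')
# escaped is [["a", "say \"hi\"", "b"]]
```

Cleaning the data before it reaches the parser keeps the parser strict, so other malformed rows still raise instead of being silently misread.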