I exported tables and queries from SQL, where some of the fields are multi-line.
The Ruby (1.9+) way to read CSV appears to be:
require 'csv'
CSV.foreach("exported_mysql_table.csv", :headers => true) do |row|
  puts row
end
Which works great if my data is like this:
"id","name","email","potato"
1,"Bob","bob@bob.bob","omnomnom"
2,"Charlie","char@char.com","andcheese"
4,"Doug","diggyd@diglet.com","usemeltattack"
(The first line is the header/attributes)
But if I have:
"id","name","address","email","potato"
1,"Bob","---
- 101 Cottage row
- Lovely Village
- \"\"
","bob@bob.bob","omnomnom"
2,"Charlie","---
- 102 Flame Street
- \"\"
- \"\"
","char@char.com","andcheese"
4,"Doug","---
- 103 Dark Cave
- Next to some geo dude
- So many bats
","diggyd@diglet.com","usemeltattack"
Then I get the error:
.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1894:in `block (2 levels) in shift': Missing or stray quote in line 2 (CSV::MalformedCSVError)
This seems to be because the end of the line doesn't have a closing quote, as the field spans several lines.
(I also tried FasterCSV, but that gem became the standard csv library in Ruby 1.9.)
Your problem is not the multiline but malformed CSV.
Replace each \" with a placeholder and remove the trailing space left at the end of a line, then parse the result.
If you have no control over the program that delivers the CSV, you have to open the file, read its contents, do the replacement, and then parse the CSV. I use __ as the replacement for \", but you can use any other characters that do not conflict with your data.
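A rough sketch of that read, replace, parse sequence might look like the following. `repair_csv` is a hypothetical helper of mine, not part of the csv library, and the heredoc stands in for the result of `File.read("exported_mysql_table.csv")` so the snippet runs on its own. It assumes the stray spaces sit directly after a quote at the end of a line.

```ruby
require 'csv'

# Hypothetical helper (not part of the csv library): cleans the raw text
# so that CSV.parse accepts it.
def repair_csv(text, placeholder = "__")
  cleaned = text.gsub(/" +$/, '"')   # drop spaces after a quote at a line end
  cleaned.gsub('\"', placeholder)    # replace backslash-escaped quotes
end

# Stand-in for: raw = File.read("exported_mysql_table.csv")
raw = <<'EOS'
"id","name","address","email","potato"
1,"Bob","---
- 101 Cottage row
- Lovely Village
- \"\"
","bob@bob.bob","omnomnom"
EOS

rows = CSV.parse(repair_csv(raw), :headers => true)
rows.each do |row|
  puts row["name"]
end
```

If you would rather keep the quote characters in the data, passing `'""'` as the placeholder turns each `\"` into a doubled quote, which is how CSV normally escapes a quote inside a quoted field.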