I have a function that converts a .csv file to a datatable. One of the columns I am converting is is a field of names that have a comma in them i.e. “Doe, John” when converting the function treats this as 2 seperate fields because of the comma. I need the datatable to hold this as one field Doe, John in the datatable.
Function CSV2DataTable(ByVal filename As String, ByVal sepChar As String) As DataTable
Dim reader As System.IO.StreamReader
Dim table As New DataTable
Dim colAdded As Boolean = False
Try
''# open a reader for the input file, and read line by line
reader = New System.IO.StreamReader(filename)
Do While reader.Peek() >= 0
''# read a line and split it into tokens, divided by the specified
''# separators
Dim tokens As String() = System.Text.RegularExpressions.Regex.Split _
(reader.ReadLine(), sepChar)
''# add the columns if this is the first line
If Not colAdded Then
For Each token As String In tokens
table.Columns.Add(token)
Next
colAdded = True
Else
''# create a new empty row
Dim row As DataRow = table.NewRow()
''# fill the new row with the token extracted from the current
''# line
For i As Integer = 0 To table.Columns.Count - 1
row(i) = tokens(i)
Next
''# add the row to the DataTable
table.Rows.Add(row)
End If
Loop
Return table
Finally
If Not reader Is Nothing Then reader.Close()
End Try
End Function
Don’t use a
.Split()function to read your csv data. Not only does it cause the kind of error you just ran into but it’s slower as well. You need a state machine -based parser. That will be faster and make it easier to correctly handle quote-enclosed text.I have an example here:
Reading CSV files in C#
and there’s also a highly-respected CSV reader on codeplex you can use:
http://www.codeproject.com/KB/database/CsvReader.aspx
You’d use my code like this:
If you’re using .net 3.5 or later, I’d write it a little differently to pull the column creation out of the for each loop (using type inference and
.Take(1)), but I wanted to be sure this would work with .Net 2.0 as well.