I have many large CSV files (1-10 GB each) which I'm importing into databases. For each file, I need to replace the first line so I can format the headers to be the column names. My current solution is:
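    // "fixed" is a reserved keyword in C#, so the output path is named fixedFile here.
    using (var reader = new StreamReader(file))
    using (var writer = new StreamWriter(fixedFile))
    {
        // Rewrite the header line, then copy every remaining line unchanged.
        var line = reader.ReadLine();
        var fixedLine = parseHeaders(line);
        writer.WriteLine(fixedLine);
        while ((line = reader.ReadLine()) != null)
            writer.WriteLine(line);
    }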
What is a quicker way to only replace line 1 without iterating through every other line of these huge files?
If you can guarantee that `fixedLine` takes up no more bytes than `line`, you can update the files in place instead of copying them. If it is shorter, pad it (for example with trailing spaces) so that the leftover bytes of the old header are overwritten.
If not, you can possibly get a small performance improvement by accessing the `.BaseStream` of your `StreamReader` and `StreamWriter` and copying in large blocks (using, say, a 32 KB buffer). That at least eliminates the time spent checking every character to see whether it is an end-of-line character, as happens now with `reader.ReadLine()`.
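Both options might look roughly like the sketch below. It is only an illustration under several assumptions: the files are UTF-8 without a byte-order mark, the header contains no quoted line breaks, and an importer that tolerates trailing spaces after the last column name is acceptable for the in-place case. The names `HeaderRewriter`, `TryReplaceInPlace` and `ReplaceByCopy` are made up for this example; `parseHeaders` stands in for the function from the question. The header is read byte by byte from the raw stream rather than through a `StreamReader`, because a `StreamReader` buffers ahead and leaves the underlying stream positioned past the end of the first line.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper class; names are illustrative, not from the question.
static class HeaderRewriter
{
    // Reads bytes up to (not including) the first '\n'.
    // Leaves the stream positioned just after that '\n', or at end of file.
    static byte[] ReadFirstLine(Stream stream)
    {
        var buffer = new MemoryStream();
        int b;
        while ((b = stream.ReadByte()) != -1 && b != '\n')
            buffer.WriteByte((byte)b);
        return buffer.ToArray();
    }

    // Length of the line without a trailing '\r' (Windows line endings).
    static int ContentLength(byte[] raw)
    {
        return raw.Length > 0 && raw[raw.Length - 1] == '\r' ? raw.Length - 1 : raw.Length;
    }

    // Option 1: overwrite the header in place. Returns false, leaving the
    // file untouched, if the new header needs more bytes than the old one.
    public static bool TryReplaceInPlace(string path, Func<string, string> parseHeaders, Encoding encoding)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            byte[] raw = ReadFirstLine(stream);
            int oldLength = ContentLength(raw);
            byte[] updated = encoding.GetBytes(parseHeaders(encoding.GetString(raw, 0, oldLength)));
            if (updated.Length > oldLength)
                return false;

            stream.Position = 0;
            stream.Write(updated, 0, updated.Length);
            // Pad with spaces so no bytes of the old header survive.
            for (int i = updated.Length; i < oldLength; i++)
                stream.WriteByte((byte)' ');
            return true; // the original line terminator is left as it was
        }
    }

    // Option 2: write a new file, copying everything after the header in
    // 32 KB blocks instead of line by line.
    public static void ReplaceByCopy(string source, string destination, Func<string, string> parseHeaders, Encoding encoding)
    {
        using (var input = new FileStream(source, FileMode.Open, FileAccess.Read))
        using (var output = new FileStream(destination, FileMode.Create, FileAccess.Write))
        {
            byte[] raw = ReadFirstLine(input);
            int oldLength = ContentLength(raw);
            byte[] updated = encoding.GetBytes(parseHeaders(encoding.GetString(raw, 0, oldLength)));

            output.Write(updated, 0, updated.Length);
            if (oldLength < raw.Length)
                output.WriteByte((byte)'\r'); // keep the original "\r\n" style
            output.WriteByte((byte)'\n');     // always terminate the header line

            input.CopyTo(output, 32 * 1024);  // block copy of the remaining data
        }
    }
}
```

A caller could try `HeaderRewriter.TryReplaceInPlace(file, parseHeaders, new UTF8Encoding(false))` first and fall back to `ReplaceByCopy` when it returns false. Only the in-place path avoids reading the rest of the file; the block copy still touches every byte, it just skips the per-line work.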