I’m very new to Python and just crawling my way through it to accomplish a task and would appreciate some help (Python 3.1).
I have a CSV file written with DictWriter with a dialect of “excel”. After the file is created, I noticed extra lines in the file, and upon closer inspection it’s because I have “\r\r\n” at the end of each line instead of “\r\n”.
I could solve this one of two ways:
- Open the file in binary mode instead of text. The problem with this is that I cannot for the life of me figure out how to get writerow() to work against a binary file; I get a ton of exceptions.
- The second (easier) solution is just replacing all the “\r\r\n” with “\r\n”.
However, in my attempts, I ran into these errors:
a. If I don’t close the file first, the search and replace just adds even more “\r\r\n” lines.
b. I’ve tried closing the file first, then re-opening it in binary mode and doing the same search and replace, but I’m getting an error:
WindowsError: [Error 32] The process cannot access the file because it is being used by another process
Here is the code:
# code before this writes to the file in text mode
myfile.close()
myfile = open(outputFile, "wb")
for line in fileinput.FileInput(outputFile, inplace=1):
    line = line.replace("\r\r\n", "\r\n")
    print (line)
myfile.close()
Would appreciate any help anyone can provide!
The safe way to alter a file (with the exception of appending, which can be safely done in-place) is to copy it with modification to a new file, remove the old one, and rename the new one to the old name. This is the one solid way to avoid catastrophic errors and data loss. Depending on the platform, the “remove old, rename new” step can be atomic, but that’s hard on Windows and not all that crucial.
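So I’d simply do that, in one big gulp unless the file is horribly huge (gigabyte-plus): read the whole file in binary mode, write the corrected contents to a new file, then remove the old file and rename the new one.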
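A minimal sketch of that copy, modify, remove, and rename sequence in Python 3 could look like the following. The helper name fix_line_endings and the ".new" suffix are assumptions made for illustration, not anything taken from your code.

```python
import os


def fix_line_endings(path):
    # Hypothetical helper name; `path` is the CSV file to repair.
    tmp_path = path + ".new"  # assumed name for the temporary copy

    # Read the whole file as bytes so no newline translation happens.
    with open(path, "rb") as f:
        data = f.read()

    # Write the corrected bytes to a separate file first.
    with open(tmp_path, "wb") as f:
        f.write(data.replace(b"\r\r\n", b"\r\n"))

    # Remove the old file, then give the new file the old name.
    os.remove(path)
    os.rename(tmp_path, path)
```

You would call it as fix_line_endings(outputFile) after the text-mode file has been closed. Because the original is only removed after the corrected copy is fully written, a failure midway leaves the original file intact.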
The problems with your code stem from confusion between binary and text mode: you can’t properly “read a line” from a file opened in binary mode, for example.
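To make the text-mode side of this concrete, here is a small sketch of where the extra “\r” likely comes from. The “excel” dialect already ends each row with “\r\n”, and a text-mode file that translates “\n” to “\r\n” (the default on Windows) then turns that into “\r\r\n”. Passing newline="\r\n" is used here only to mimic that translation on any platform, and the write_rows helper and field names are made up for the demonstration. The csv documentation recommends opening the file with newline="" to switch the translation off.

```python
import csv
import os
import tempfile


def write_rows(path, newline):
    # newline="\r\n" mimics the default text-mode translation on Windows;
    # newline="" turns translation off, as the csv docs recommend.
    with open(path, "w", newline=newline) as f:
        writer = csv.DictWriter(f, fieldnames=["a", "b"], dialect="excel")
        writer.writerow({"a": 1, "b": 2})
    # Read the result back as raw bytes to see the real line endings.
    with open(path, "rb") as f:
        return f.read()


tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "demo.csv")

translated = write_rows(path, "\r\n")   # line ends in \r\r\n
untranslated = write_rows(path, "")     # line ends in \r\n
```

If you can change the code that writes the file, opening it with newline="" avoids the problem at the source, so no search and replace is needed afterwards.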
Edit: in Python 3.1 we need to deal with bytes instances here, not strings, since the files are binary ones. So, per the docs, the arguments passed to replace in the write call must be bytes literals, b"\r\r\n" and b"\r\n"; those b prefixes tell Python we’re dealing with bytes, not strings.
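As a quick illustration of the difference (the sample data is made up), calling replace on a bytes object works with bytes arguments and raises TypeError with str arguments in Python 3:

```python
# Sample contents as they would be read from a file opened with "rb".
data = b"name,value\r\r\nx,1\r\r\n"

# bytes arguments: works as expected.
fixed = data.replace(b"\r\r\n", b"\r\n")

# str arguments: bytes.replace rejects them with a TypeError.
try:
    data.replace("\r\r\n", "\r\n")
except TypeError:
    str_args_rejected = True
else:
    str_args_rejected = False
```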