I stumbled across something that is not really a problem, but is rather puzzling. I am copying an XML file, myxml.xml, to myxml_copy.xml, and the output file is larger than the original. I don't understand why. Does this have anything to do with file encoding?
Anyway, here is the code I am using (it is fairly trivial):
from xml.dom.minidom import parseString
import sys

def parseXml():
    data = open(in_filename,'r').read()
    return data

try:
    in_filename = sys.argv[1]
    out_filename = sys.argv[2]
    out_file = open(out_filename,'w')
    out_file.write(parseXml())
    out_file.close()
except Exception,e:
    print "usage: python copy.py <in_file> <out_file>"
    print "Error",e
NOTE: I am not looking for a way to copy a file. I will be modifying the original XML file later (cutting and pasting different parts of it).
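I think the problem is the mode you open the files with: it needs to be 'rb' and not just 'r', and 'wb' instead of 'w' (the 'b' means binary mode). With 'rb', sequences like \r\n stay as they are, but with 'r' they become \n. Likewise, on Windows, text mode 'w' writes each \n as \r\n, which would make the copy larger if the original used plain \n line endings. In short, just change the lines:

    data = open(in_filename,'r').read()
    out_file = open(out_filename,'w')

to

    data = open(in_filename,'rb').read()
    out_file = open(out_filename,'wb')
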
Did that help?
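
To make the binary-mode suggestion concrete, here is a minimal sketch. The helper names copy_bytes and size_after_text_write are made up for illustration and are not part of the original script. The second helper uses io.open with an explicit newline argument to imitate, on any platform, what Windows text mode does when writing. That is an assumption about the cause of the size difference, not something confirmed by the question.

```python
import io
import os
import tempfile


def copy_bytes(in_filename, out_filename):
    """Copy a file byte for byte; 'rb' and 'wb' turn off newline translation."""
    with open(in_filename, 'rb') as src:
        data = src.read()
    with open(out_filename, 'wb') as dst:
        dst.write(data)


def size_after_text_write(text, newline):
    """Write text in text mode with the given newline setting; return the file size.

    Hypothetical helper: newline='\\r\\n' imitates Windows text mode,
    newline='\\n' writes the text without any translation.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    with io.open(path, 'w', encoding='utf-8', newline=newline) as f:
        f.write(text)
    size = os.path.getsize(path)
    os.remove(path)
    return size
```

With these helpers, size_after_text_write(u'a\nb\n', '\n') gives 4 bytes, while size_after_text_write(u'a\nb\n', '\r\n') gives 6, one extra byte per line. A file copied with copy_bytes should have exactly the same size as its source.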