I’m running into an issue with extra newlines on Windows that I don’t get on *nix platforms. The code:
file = open('UTF16file.xml', 'rb')
html = file.read().decode('utf-16')
file.close()
regexp = re.compile(self.originalurl, re.S)
(html, changes) = regexp.subn(self.newurl, html)
file = open('UTF16file-regexed.xml', 'w+')
file.write(html.encode('utf-16'))
file.close()
Running this code on my Mac works: I get my file back without the extra line breaks. On Windows, the output has extra line breaks. So far I’ve tried:
- Encoding the regular expression as UTF-16 instead of decoding the file: breaks on both Windows and OS X.
- Writing in mode ‘wb’ instead of ‘w+’: breaks on Windows.
Any ideas?
Looks like:
(though when I copy-pasted it from Notepad to FF it actually put in line breaks)…but this:
Looks like:
(on Windows XP SP3 32-bit)
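For reference, here is the ‘wb’ variant from the list above written out in full so it can be tested in isolation. This is only a sketch, assuming the extra line breaks come from newline translation when the already-encoded UTF-16 bytes are written through a text-mode file on Windows. The function name `replace_in_utf16_file` and its parameters are made up for illustration; the pattern and replacement stand in for `self.originalurl` and `self.newurl`.

```python
import io
import re


def replace_in_utf16_file(src_path, dst_path, pattern, replacement):
    """Regex-replace text in a UTF-16 file, leaving line endings untouched.

    Hypothetical helper: names and signature are illustrative only.
    """
    # Binary read: the bytes arrive exactly as stored, so any \r\n pairs
    # survive the decode as the two characters \r and \n.
    with io.open(src_path, 'rb') as src:
        text = src.read().decode('utf-16')

    regexp = re.compile(pattern, re.S)
    text, changes = regexp.subn(replacement, text)

    # Binary write: no newline translation is applied to the encoded bytes.
    # Assumption: with a text-mode file on Windows (Python 2), every 0x0A
    # byte would be expanded to 0x0D 0x0A, which is what adds the extra
    # line breaks and can also misalign the 2-byte UTF-16 code units.
    with io.open(dst_path, 'wb') as dst:
        dst.write(text.encode('utf-16'))

    return changes
```

Called as `replace_in_utf16_file('UTF16file.xml', 'UTF16file-regexed.xml', old_url, new_url)`, it returns the number of substitutions made. If the output still shows doubled line breaks with this version, the translation is presumably happening somewhere other than the write step.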