When running svndumpfilter2 on Windows, I hit a problem that seems to originate from the dump file sometimes containing CRLF line endings.
Some files in the SVN database have CRLF line endings, but Python seems to count CRLF as one character (the CR is not counted separately from the LF that follows it in the file content). As a result, it fails to read the right number of characters and misses the start of the next lump.
So my question is: how do I tell Python to treat CRLF as two separate characters?
The stream is read from sys.stdin so I’m looking for a way to change the newline property of stdin. What is the “one right way” to do that in Python?
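To make the miscount concrete, here is a minimal sketch of the effect. It assumes Python 3, and uses io.TextIOWrapper with newline=None as a stand-in for text-mode newline translation; the sample data is made up for illustration and is not taken from a real dump file.

```python
import io

# Hypothetical sample content: two lines with CRLF endings, 20 bytes in total.
raw = b"line one\r\nline two\r\n"

# newline=None turns on universal-newline translation, so each CRLF pair
# is handed back to the caller as a single "\n".
text = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=None).read()

byte_count = len(raw)    # what a length field measured in bytes would describe
char_count = len(text)   # what a translated text read actually returns
print(byte_count, char_count)
```

A reader that trusts a byte length but reads translated text ends up off by one for every CRLF, which matches the symptom of missing the start of the next lump.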
Update: One way that occurs to me is to explicitly set the mode of stdin to binary, so that CRLF is read as two characters.
Another way is to start Python with the -u flag, which results in an unbuffered stdin (as well as stdout and stderr); the Python 2 documentation notes that, on systems where it matters, -u also puts these streams in binary mode. So just run python -u myscript.py, where myscript.py calls stdin.read(1), with no other changes. See python --help for more information on this.
Old: If you're on Windows, Python should be able to handle this without any intervention when you call sys.stdin.readline (or simply iterate over sys.stdin, which is a file-like object). Are you using sys.stdin.read instead? If so, you need to handle that case yourself.
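The binary-mode approach from the update could look roughly like the sketch below. This is an illustration under stated assumptions, not necessarily the exact code originally posted: binary_stdin and read_exact are hypothetical helper names, msvcrt.setmode is only available on Windows, and sys.stdin.buffer only exists on Python 3.

```python
import io
import os
import sys


def binary_stdin():
    """Return stdin as a byte stream with newline translation turned off."""
    if sys.platform == "win32":
        # msvcrt is Windows-only; O_BINARY stops the C runtime from
        # collapsing CRLF into LF on reads.
        import msvcrt
        msvcrt.setmode(sys.stdin.fileno(), os.O_BINARY)
    # Python 3 exposes the underlying byte stream as sys.stdin.buffer;
    # on Python 2 there is no such attribute, so fall back to sys.stdin.
    return getattr(sys.stdin, "buffer", sys.stdin)


def read_exact(stream, count):
    """Read exactly `count` bytes, raising EOFError if the stream ends early."""
    data = stream.read(count)
    if len(data) != count:
        raise EOFError("expected %d bytes, got %d" % (count, len(data)))
    return data


# Demonstration on an in-memory byte stream: CR and LF each count as one byte.
sample = io.BytesIO(b"ab\r\ncd")
first = read_exact(sample, 4)
```

In a real script you would pass binary_stdin() as the stream, so that the lengths recorded in the dump file line up with what read returns.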