How can I programmatically detect which record/row separator is used in a CSV file? In about 90% of cases it is a CR/LF pair, but sometimes it is CR or LF alone.
It should also take into account that line feeds in quoted data fields can differ from the row/record separators.
How can I do that?
Update: I am only interested in which row/record separator is used: CR/LF, CR, or LF.
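When you open a file with iostreams in text mode, the library translates the platform's native line ending (on Linux, Windows, or macOS) to '\n' for you. That is convenient for ordinary reading, but it hides the very bytes you want to detect, so open the file in binary mode (std::ios::binary) for this task.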
Line breaks inside a quoted data field can be detected by counting unescaped quotes. If the number of quotes seen so far is odd when you reach a line break, the field is still open, so that line break is embedded data and not a record separator.
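The quote-counting idea can be sketched as follows. This is only an illustration: the function name is made up for this example, and it assumes RFC 4180-style escaping, where a literal quote inside a field is written as two quotes. Under that assumption an escaped quote adds two to the count, so the parity of the total quote count is enough.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper (not from any library): returns true when `text`
// ends inside a quoted field, i.e. it contains an odd number of quotes.
// Assumes quotes are escaped by doubling ("") as in RFC 4180; a doubled
// quote contributes two to the count and so does not change the parity.
bool ends_inside_quoted_field(const std::string& text) {
    const auto quotes = std::count(text.begin(), text.end(), '"');
    return quotes % 2 != 0;
}
```

If the dialect escapes quotes with a backslash instead, the parity trick no longer holds and the escaped quotes would have to be skipped explicitly.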
To find out which line separator is used, read the file character by character until you hit either CR or LF. If it is LF, you are done. If it is CR, read the next character: if that is LF, the line ending is CR LF; otherwise it is just CR.