If I get an element that has a <br /> inside and read its text with the innerText property, I'm seeing that the line break is two characters: 13 and 10. What determines this? Is it the browser or the web page's encoding?
I want to either make sure line breaks are always going to be these two characters (as long as they're part of the static content of the web page and not dynamically created content) or modify my text processing algorithm to handle both possibilities.
This is something I’ll be using to split text into lines with the split method. I’m not sure if I should use split("\r\n") or some more complicated code.
It depends on your editor and/or OS. Windows uses \r\n (13, 10). Unix systems use only \n (10). Old Macs used \r (13).
You could just replace all \r\n with \n and then split on \n. So
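Something like this would do it. It's only a rough sketch: splitLines and splitLinesRegex are names I made up for illustration, and I'm assuming you also want a lone \r treated as a line break, which goes slightly beyond the \r\n replacement described above.

```javascript
// Hypothetical helper: normalize every line-ending style to \n, then split.
function splitLines(text) {
  return text
    .replace(/\r\n/g, "\n") // Windows style: \r\n becomes \n
    .replace(/\r/g, "\n")   // assumption: a leftover lone \r is also a break
    .split("\n");
}

// Alternative: split directly on a regular expression.
// \r\n is listed first so the pair counts as a single break.
function splitLinesRegex(text) {
  return text.split(/\r\n|\r|\n/);
}

// Logs the four lines: one, two, three, four
console.log(splitLines("one\r\ntwo\nthree\rfour"));
```

Either version should give the same lines whether the break comes back as \r\n or just \n, so you wouldn't have to rely on split("\r\n") matching every browser.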