I’m writing a Haskell program which generates an XML file. Apparently it is considered traditional to specify the character encoding in the <?xml?> tag. My question is, what’s the best thing to do?
- Use hGetEncoding to look up the file’s encoding, and record that in the XML file header.
- Use hSetEncoding to specify which encoding I want, and then hard-code that into the XML file header.
The first option appears to have the problem that I’d need a way to translate what Haskell calls the encoding into what XML calls it. The second has the problem that unless I can figure out what encoding all the other applications on my PC use, the file will be unreadable (except to web browsers).
All of which is slightly baffling, because I almost certainly don’t even need Unicode anyway. I’m just writing plain ordinary English text with no special characters… (Ah, but the £ sign varies by encoding, doesn’t it? sigh)
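I would recommend you use one of the existing XML libraries on Hackage, such as xml-conduit, which will handle encoding issues for you automatically. In general, I strongly recommend outputting UTF-8: every conforming XML parser is required to support it, it is what a parser assumes when no encoding is declared, and plain ASCII text is already valid UTF-8, so ordinary English text comes out byte-for-byte the same while characters like £ still work.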
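If you do want to write the file by hand rather than through a library, the second option from the question can be sketched roughly as below. This is only a minimal illustration using `System.IO` from base; the names `withUtf8Declaration` and `writeXmlUtf8` and the file name `prices.xml` are made up for the example, and it assumes the body you pass in is already well-formed, properly escaped XML.

```haskell
module Main where

import System.IO

-- | Prepend an XML declaration that names UTF-8.
-- Assumption: the body is already well-formed, escaped XML.
withUtf8Declaration :: String -> String
withUtf8Declaration body =
  "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" ++ body

-- | Write the document with the handle forced to UTF-8, so the bytes
-- on disk match the declaration whatever the system locale is.
writeXmlUtf8 :: FilePath -> String -> IO ()
writeXmlUtf8 path body =
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8
    hPutStr h (withUtf8Declaration body)

main :: IO ()
main = writeXmlUtf8 "prices.xml" "<price>£5</price>\n"
```

The point of calling `hSetEncoding` explicitly is that the declaration and the actual bytes can never disagree, which is the failure mode to worry about if you rely on whatever the locale default happens to be.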