Further to this question I’ve got a supplementary problem.
I’ve found a track with an “É” in the title.
My code:
var playList = new StreamWriter(playlist, false, Encoding.UTF8);
–
private static void WriteUTF8(StreamWriter playList, string output)
{
    byte[] byteArray = Encoding.UTF8.GetBytes(output);
    foreach (byte b in byteArray)
    {
        playList.Write(Convert.ToChar(b));
    }
}
converts this to the following bytes:
195
137
which is being output as Ã followed by a square (which is a character that can’t be printed in the current font).
I’ve exported the same file to a playlist in Media Monkey and it writes the “É” as “É”, which I’m assuming is correct (as KennyTM pointed out).
My question is, how do I get the “‰” symbol output? Do I need to select a different font and if so which one?
UPDATE
People seem to be missing the point.
I can get the “É” written to the file using
playList.WriteLine("É");
that’s not the problem.
The problem is that Media Monkey requires the file to be in the following format:
#EXTINFUTF8:140,Yann Tiersen - Comptine D'Un Autre Été: L'AprÃ¨s Midi
#EXTINF:140,Yann Tiersen - Comptine D'Un Autre Été: L'Après Midi
#UTF8:04-Comptine D'Un Autre Été- L'AprÃ¨s Midi.mp3
04-Comptine D'Un Autre Été- L'Après Midi.mp3
Where all the “high-ascii” (for want of a better term) are written out as a pair of characters.
UPDATE 2
I should be getting c9 replaced by c3 89.
I was going to put what I’m actually getting, but in doing the tests for this I’ve managed to get a test program to output the text in the right format “as is”. So I need to do some more investigation.
StreamWriter already converts the characters you send it to UTF-8: that’s its entire purpose. Throw WriteUTF8 away; it’s broken and useless.
(WriteUTF8 is taking characters, converting them to UTF-8 bytes, converting each single byte to the character with the same numeric value (Convert.ToChar on a byte does no code page lookup, so this is effectively a Latin-1 decode), then encoding each of those characters in UTF-8. So you end up with a doubly-UTF-8-encoded string: É goes out as c3 83 c2 89 instead of c3 89.)
The problem you’re having with Media Monkey may be just that it doesn’t support UTF-8 or Unicode filenames at all. Try asking it to play (and export a playlist for) files with characters that don’t fit in your system codepage, for example by renaming a file to αβγ.mp3.
Edit:
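To make the double encoding in WriteUTF8 concrete, here is a minimal sketch that repeats the question's loop against a MemoryStream and compares the result with writing the string directly. The class and method names are made up for illustration, and the byte order mark is switched off with UTF8Encoding(false) only so that the dumps show nothing but the text bytes.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical demo class; not part of the question's program.
public static class DoubleEncodingDemo
{
    // Repeats the question's WriteUTF8 loop: every UTF-8 byte becomes a char
    // with the same numeric value, and the writer then encodes that char again.
    public static byte[] WriteLikeWriteUTF8(string text)
    {
        var stream = new MemoryStream();
        // UTF8Encoding(false) suppresses the BOM (an assumption for readability).
        using (var writer = new StreamWriter(stream, new UTF8Encoding(false)))
        {
            foreach (byte b in Encoding.UTF8.GetBytes(text))
            {
                writer.Write(Convert.ToChar(b));
            }
        }
        // ToArray still works after the writer has closed the stream.
        return stream.ToArray();
    }

    // Lets StreamWriter do the encoding on its own, once.
    public static byte[] WriteDirectly(string text)
    {
        var stream = new MemoryStream();
        using (var writer = new StreamWriter(stream, new UTF8Encoding(false)))
        {
            writer.Write(text);
        }
        return stream.ToArray();
    }

    public static void Main()
    {
        string eAcute = "\u00C9"; // É
        Console.WriteLine(BitConverter.ToString(WriteLikeWriteUTF8(eAcute))); // C3-83-C2-89
        Console.WriteLine(BitConverter.ToString(WriteDirectly(eAcute)));      // C3-89
    }
}
```

The second line of output is the c3 89 the question is after, which is why dropping the helper and calling Write or WriteLine directly should be enough for the UTF-8 lines.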
OK, what you’ve got there is a mixture of encodings in the same file: it’s no wonder text editors are going to have trouble opening it. The uncommented and #EXTINF lines are in the system default code page, and are present to support media players that can’t read Unicode filenames. Any filename characters not present in the system code page (e.g. Greek as above, on a Western Windows install) will be mangled and unplayable for anything that doesn’t know about the #UTF8 (and #EXTINFUTF8 for the description) lines.
So if this is your target format, you’ll need to grab two encodings and use each in turn, something like: