I have seen that some sources, such as The Unicode Book and some Wikipedia articles, say that Unicode is the default character set of HTML and XML.
I understand “character set” to mean the repertoire of characters you can work with when creating a file. Yet some editors apply their own default character set regardless of the kind of file being edited; even when you are creating an HTML file, some editors do not default to Unicode.
That leaves the question: is Unicode the default character set of HTML and XML, or does it depend on the editor used to create the file?
I suppose that you could call Unicode “the default” because both HTML and XML define their allowed content in terms of Unicode.
However, a file can’t be “in Unicode”; it has to be in some encoding of Unicode. By default, XML files are required to be in either UTF-8 or UTF-16, unless the encoding declaration in the prolog specifies otherwise. The HTML spec explicitly leaves the supported encodings undefined and indicates that the encoding is handled by the transport protocol (e.g., HTTP).
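To make the difference between the character set and the encoding concrete, here is a small Python sketch. It is only an illustration: the sample word and the variable names are my own, and it uses Python's standard xml.etree.ElementTree parser as a stand-in for "an XML processor". It shows the same Unicode characters stored as different bytes, and an XML parser recovering identical text from either a default UTF-8 document or one that declares another encoding.

```python
import xml.etree.ElementTree as ET

# Illustrative sample text: "café", written with an escape so the
# example does not depend on how this snippet itself is saved.
text = "caf\u00e9"

# One sequence of Unicode characters, three different byte sequences.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")
latin1_bytes = text.encode("latin-1")

print(utf8_bytes)    # b'caf\xc3\xa9'
print(utf16_bytes)   # b'c\x00a\x00f\x00\xe9\x00'
print(latin1_bytes)  # b'caf\xe9'

# XML with no encoding declaration: the parser assumes UTF-8.
doc_default = b"<p>" + utf8_bytes + b"</p>"

# XML whose declaration names a different encoding for the same characters.
doc_declared = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?><p>' + latin1_bytes + b"</p>"
)

# Both documents yield the same Unicode string once parsed.
parsed_default = ET.fromstring(doc_default).text
parsed_declared = ET.fromstring(doc_declared).text
print(parsed_default == parsed_declared == text)  # True
```

In both cases the parsed content is the same Unicode text; only the bytes in the file differ. That is the part an editor's default setting actually controls: the encoding it writes, not the character set the format is defined in.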