The JavaDoc says “The null byte ‘\u0000’ is encoded in 2-byte format rather than 1-byte, so that the encoded strings never have embedded nulls.”
But what does this even mean? What’s an embedded null in this context? I am trying to convert a UTF-8 string that was saved by Java into “real” UTF-8.
In C, a string is terminated by a byte with the value 00.
Java strings, on the other hand, can contain the character '\u0000'. To avoid confusion when such a string is passed over to C (which native methods are typically written in), that character is encoded differently, namely as the two bytes C0 80
(as the JavaDoc says), neither of which is actually 00. An "embedded null" is simply a 00 byte in the middle of the encoded string, which C code would mistake for the end of the string.
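To see the difference, here is a small sketch that encodes the same string both ways. It assumes the data in question was written with `DataOutputStream.writeUTF`, which puts a 2-byte length in front of the modified UTF-8 bytes; the class and helper names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class ModifiedUtf8Demo {
    // Encode with DataOutputStream.writeUTF: a 2-byte big-endian length,
    // followed by the string in modified UTF-8.
    static byte[] modifiedUtf8(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        return buf.toByteArray();
    }

    // Render bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        String s = "a\u0000b";
        // Modified UTF-8: length 00 04, then 'a', C0 80 for U+0000, 'b'.
        System.out.println(hex(modifiedUtf8(s)));                    // 00 04 61 C0 80 62
        // Standard UTF-8: U+0000 is the single byte 00.
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8))); // 61 00 62
    }
}
```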
This is a hack to work around something you cannot change easily: C's convention that a 00 byte ends a string.
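Also note that this two-byte form is an overlong encoding, which standard UTF-8 does not allow. A lenient decoder will decode it to 00, but a strict one will reject it, so do not rely on it being accepted as "real" UTF-8.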
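For the actual conversion, one approach is to let Java decode the modified form into a `String` and then re-encode that as standard UTF-8. The sketch below again assumes the bytes came from `DataOutputStream.writeUTF` (so they start with the 2-byte length); the class and method names are only illustrative. Going through a `String` should also take care of the other difference in modified UTF-8, namely that supplementary characters are stored as surrogate pairs.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ModifiedToStandardUtf8 {
    // Assumption: the input was produced by DataOutputStream.writeUTF,
    // i.e. a 2-byte big-endian length followed by modified UTF-8 bytes.
    static byte[] toStandardUtf8(byte[] writeUtfBytes) throws IOException {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(writeUtfBytes))) {
            String decoded = in.readUTF();                  // understands C0 80
            return decoded.getBytes(StandardCharsets.UTF_8); // standard UTF-8
        }
    }

    public static void main(String[] args) throws IOException {
        // "a", U+0000, "b" as written by writeUTF.
        byte[] saved = {0x00, 0x04, 0x61, (byte) 0xC0, (byte) 0x80, 0x62};
        System.out.println(Arrays.toString(toStandardUtf8(saved))); // [97, 0, 98]
    }
}
```

If the bytes come from somewhere else (for example a JNI string without the length prefix), `readUTF` will not fit as is and you would have to decode the modified form yourself.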