I have a Java servlet that receives data from an upstream system via an HTTP GET request. This request includes a parameter named “text” and another named “charset” that indicates how the text parameter was encoded.
If I instruct the upstream system to send me the text TĀ and debug the servlet request params, I see the following:
request.getParameter("charset") == "UTF-16LE"
request.getParameter("text").getBytes() == [0, 84, 1, 0]
The code points (in hex) for the two characters in this string are:
[T] 0054
[Ā] 0100
I cannot figure out how to convert this byte[] back to the String "TĀ". I should mention that I don’t entirely trust the charset parameter and suspect the text may actually be encoded as UTF-16BE.
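Use the String(byteArray, charset) constructor. Note that the bytes shown, [0, 84, 1, 0], are in big-endian order (00 54 for T, 01 00 for Ā), so decoding them as UTF-16BE yields "TĀ", while UTF-16LE does not.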
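A minimal sketch of that constructor applied to the bytes from the question. The class name DecodeTextParam and the helpers decode and redecode are hypothetical names chosen for illustration. The redecode helper additionally assumes that the servlet container decoded the query string as ISO-8859-1, so that each char of the parameter holds exactly one raw byte; if the container used a different encoding, that step would not recover the original bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeTextParam {

    // Decode raw bytes using the charset with the given name (e.g. "UTF-16BE").
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    // ASSUMPTION: the container decoded the parameter as ISO-8859-1,
    // so mapping each char back to one byte recovers the bytes on the wire.
    static String redecode(String param, String charsetName) {
        byte[] raw = param.getBytes(StandardCharsets.ISO_8859_1);
        return decode(raw, charsetName);
    }

    public static void main(String[] args) {
        byte[] raw = {0, 84, 1, 0}; // the bytes observed in the question

        // Big-endian: 00 54 -> U+0054 (T), 01 00 -> U+0100 (A with macron)
        System.out.println(decode(raw, "UTF-16BE").equals("T\u0100")); // true

        // Little-endian pairs the same bytes as U+5400 and U+0001 instead
        System.out.println(decode(raw, "UTF-16LE").equals("T\u0100")); // false
    }
}
```

In the servlet this would be called as redecode(request.getParameter("text"), "UTF-16BE"), ignoring the reported charset parameter when it is known to be unreliable.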