I am getting some unexpected results from what I thought was a simple test. After running the following:
byte[] bytes = {(byte)0x40, (byte)0xE2, (byte)0x56, (byte)0xFF, (byte)0xAD, (byte)0xDC};
String s = new String(bytes, Charset.forName("UTF-8"));
byte[] bytes2 = s.getBytes(Charset.forName("UTF-8"));
bytes2 is a 14-element array that looks nothing like the original (bytes). Is there a way to do this sort of conversion and still get the original bytes back?
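Well, that doesn’t look like valid UTF-8 to me, so I’m not surprised it didn’t round-trip. 0xE2 starts a three-byte sequence but is followed by 0x56, which is not a continuation byte; 0xFF can never appear in UTF-8; 0xAD is a continuation byte with nothing to continue; and 0xDC starts a two-byte sequence that is cut off by the end of the array. The decoder replaces each of those four malformed bytes with U+FFFD, which is three bytes when re-encoded, so you get 4 × 3 bytes plus the two valid ASCII bytes (0x40 and 0x56) = 14.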
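If you want to confirm that the input is the problem, one option is to decode with a strict decoder that reports malformed input instead of silently substituting U+FFFD. This is only a sketch: the class name StrictUtf8Check and the helper isValidUtf8 are made up for illustration, and it assumes Java 7 or later for StandardCharsets.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictUtf8Check {
    // Hypothetical helper: true only if the bytes are well-formed UTF-8.
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)       // throw instead of replacing
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0x40, (byte) 0xE2, (byte) 0x56, (byte) 0xFF, (byte) 0xAD, (byte) 0xDC};

        // The strict decoder rejects the array from the question.
        System.out.println(isValidUtf8(bytes));

        // new String(...) is lenient: bad bytes become U+FFFD, so the
        // re-encoded array is longer than the six bytes that went in.
        String s = new String(bytes, StandardCharsets.UTF_8);
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

A String is a sequence of characters, not a byte container, so the lenient constructor has no way to remember which bytes it threw away.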
If you want to convert arbitrary binary data to text in a reversible way, use base64, e.g. via this public domain encoder/decoder.
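As a rough sketch of the base64 approach: on Java 8 or later you can use the built-in java.util.Base64 class rather than a third-party encoder (that substitution is my assumption, not the encoder linked above). The class name Base64RoundTrip and the helpers toText and fromText are made up for illustration.

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64RoundTrip {
    // Hypothetical helper: encode arbitrary bytes as an ASCII-only string.
    static String toText(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Hypothetical helper: recover the exact original bytes from that string.
    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0x40, (byte) 0xE2, (byte) 0x56, (byte) 0xFF, (byte) 0xAD, (byte) 0xDC};

        String s = toText(bytes);      // safe to store or send as text
        byte[] bytes2 = fromText(s);   // same six bytes as the input

        System.out.println(s);
        System.out.println(Arrays.equals(bytes, bytes2));
    }
}
```

The cost is size: base64 emits four characters for every three input bytes, so the text is about a third larger than the binary data.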