I have run into a Java String for which the following is false:
body.equals(new String(body.getBytes()));
I suppose this is because the String constructor treats the body byte[] as UTF-8 by default, but I'm not 100% sure. How can I store this string in a byte[] and convert it back later? I suppose I need to determine what encoding the byte[] is in. How would I do this?
Some context: I need the byte[] so I can compress the data, store it in a db, and later decompress it and turn the decompressed byte[] back into the original string. The string originally comes from a library that downloaded a webpage, and I'm not sure what processing it does on the string before handing it to me.
Just make sure that you use the same charset both ways: when creating the byte array from the String and when creating the String from the byte array.
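So your example would be better as something like the following (UTF-8 is shown for illustration; what matters is that the same charset, one able to represent every character in the string, is named in both directions):
body.equals(new String(body.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8));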
Naming the charset explicitly guarantees that the bytes are decoded the same way they were encoded, whatever the default charset of the environment happens to be.
You should also, almost unquestionably, be using a Unicode encoding such as UTF-8. If you choose a single-byte encoding (e.g. an ISO code page), any character it cannot represent is lost on conversion, and you will likely regret the choice in future, even if there is a single-byte encoding that satisfies your needs right now.
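For the compress, store and restore scenario described in the question, the same rule applies at both ends of the pipeline. The following is only a minimal sketch, assuming UTF-8 as the charset, GZIP as the compression format, and Java 9 or later (for `InputStream.readAllBytes`). The class name `StringBytesRoundTrip` and the method names `compress` and `decompress` are hypothetical, chosen for illustration, and the database step is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; the names are illustrative, not from any library.
public class StringBytesRoundTrip {

    // Encode the string with an explicit charset (assumed: UTF-8), then gzip the bytes.
    public static byte[] compress(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed at this point, so the buffer holds the complete output.
        return buffer.toByteArray();
    }

    // Gunzip the stored bytes, then decode them with the SAME charset used in compress().
    public static String decompress(byte[] stored) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(stored))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String body = "caf\u00e9 \u65e5\u672c\u8a9e"; // sample text with non-ASCII characters
        byte[] stored = compress(body);               // this is what would go into the db
        System.out.println(body.equals(decompress(stored)));
    }
}
```

Because the charset is fixed in the code, there is no need to detect the encoding of the byte[] later: the bytes were produced by your own `getBytes` call, so you already know how to decode them.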