I have an array “result” that contains values from 0 to 255. I originally declared it as byte[], but when I write 128, result[i] holds -128, and in the output file it appears as “€”, which is finally read back as 8364.
Since byte only accepts values from -128 to 127, what data type should I use for values from 0 to 255 (without wasting memory)?
Should I also change the Content-Type or add a charset header?
Thanks
res.setContentType("application/octet-stream");
res.setHeader("Content-Disposition","attachment;filename=output.js");
ServletOutputStream os = res.getOutputStream();
byte[] result = encode(req.getParameter("originalScript")); // result[i] == -128 (should be 128)
os.write(result, 0, result.length); // result[i] shows up in output.js as "€" (8364)
You’re confused by mixing several concepts.
First of all, the int 128 and the byte -128 share the same 8-bit pattern (int 255 == byte -1, 254 == -2, …, 128 == -128). Java bytes are signed, and the sign information is in the highest bit. Your mistake here is that you didn’t convert the byte value back to an int correctly. To fix this, mask the byte with 0xFF when you widen it: for a byte b holding (byte) 128, printing b and b & 0xFF gives -128 and 128.
Next: ASCII is only defined for values between 0 and 127. This means anything > 127 is garbage unless you treat it carefully.
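To make the byte-to-int conversion concrete, here is a minimal sketch. The class and method names (UnsignedByteDemo, toUnsigned) are made up for illustration; the only point is that masking with 0xFF recovers the 0-255 value from a signed byte.

```java
public class UnsignedByteDemo {
    // Widen a signed Java byte to its unsigned value (0-255).
    // On Java 8+ this is equivalent to Byte.toUnsignedInt(b).
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        int original = 128;
        byte stored = (byte) original; // narrowing keeps the low 8 bits (0x80)

        System.out.println(stored);                 // -128
        System.out.println(toUnsigned(stored));     // 128
        System.out.println(toUnsigned((byte) 255)); // 255
    }
}
```

Because the bit pattern is preserved, byte[] can stay as the storage type; the mask is only needed when a value is read back as a number.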
The problem shows up when the output of your code is read. Since ASCII can’t contain values > 127, what should the reading code do?
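As a rough illustration of why the reader's choice matters, the sketch below decodes the same single byte (128) under three charsets. The names DecodeGuessDemo and firstCodePoint are hypothetical, and windows-1252 is only an assumption about what the reading side fell back to; it is chosen because it would reproduce the “€” (8364) seen in the question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeGuessDemo {
    // Decode one byte under the given charset and return the resulting code point.
    static int firstCodePoint(byte b, Charset cs) {
        return new String(new byte[] { b }, cs).codePointAt(0);
    }

    public static void main(String[] args) {
        byte b = (byte) 128;

        // ISO-8859-1 maps every byte to the code point with the same number.
        System.out.println(firstCodePoint(b, StandardCharsets.ISO_8859_1)); // 128

        // US-ASCII has no mapping above 127, so the decoder substitutes U+FFFD.
        System.out.println(firstCodePoint(b, StandardCharsets.US_ASCII)); // 65533

        // Assumed fallback: windows-1252 maps 0x80 to the euro sign, U+20AC.
        System.out.println(firstCodePoint(b, Charset.forName("windows-1252"))); // 8364
    }
}
```

The bytes on the wire are identical in all three cases; only the interpretation differs.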
“output.js” sounds like you’re using a web browser to read this data as a JavaScript file. The web browser will try to convert the byte stream into text using an “encoding”. Since you don’t specify one, the browser has to guess, and it guesses wrong (application/octet-stream seems wrong, too; shouldn’t that be text/javascript?).
You have two options:
1. Change encode() to return a properly encoded UTF-8 string (UTF-8 is a way to send Unicode as bytes) and set the charset to UTF-8 (which is usually the default, but better safe than sorry).
2. Set the charset to ISO-8859-1, which will preserve the bytes 1:1. This will fail if your script contains any Unicode characters > 255. Since there won’t be an error, you should not use this approach. I just mention it for completeness.
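The two options can be sketched in plain Java as follows. This is not the servlet code itself: CharsetOptionsDemo, encodeUtf8 and encodeLatin1 are hypothetical helpers, and the comment about the Content-Type shows where a servlet would presumably declare the charset.

```java
import java.nio.charset.StandardCharsets;

public class CharsetOptionsDemo {
    // Option 1: treat the script as text and encode it as UTF-8.
    // In a servlet this would be paired with something like
    // res.setContentType("text/javascript; charset=UTF-8").
    static byte[] encodeUtf8(String script) {
        return script.getBytes(StandardCharsets.UTF_8);
    }

    // Option 2: ISO-8859-1 maps chars 0-255 to bytes 0-255 one to one.
    // Anything above 255 is silently replaced with '?'.
    static byte[] encodeLatin1(String script) {
        return script.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String c128 = String.valueOf((char) 128);

        // UTF-8 needs two bytes for code point 128 (0xC2 0x80).
        System.out.println(encodeUtf8(c128).length); // 2

        // ISO-8859-1 keeps it as the single byte 0x80.
        System.out.println(encodeLatin1(c128)[0] & 0xFF); // 128

        // The euro sign (U+20AC) does not fit: no exception, just '?' (63).
        System.out.println(encodeLatin1("\u20AC")[0]); // 63
    }
}
```

The last line shows the silent data loss that makes the second option risky: the write succeeds, but the character is gone.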