I’ve been reading the book TCP/IP Sockets in Java, 2nd Edition. I was hoping to get more clarity on something, but since the book’s website doesn’t have a forum or anything, I thought I’d ask here.
In several places, the book uses a byte mask to avoid sign extension. Here’s an example:
    private final static int BYTEMASK = 0xFF; // 8 bits

    public static long decodeIntBigEndian(byte[] val, int offset, int size) {
        long rtn = 0;
        for (int i = 0; i < size; i++) {
            rtn = (rtn << Byte.SIZE) | ((long) val[offset + i] & BYTEMASK);
        }
        return rtn;
    }
So here’s my guess of what’s going on. Let me know if I’m right.
BYTEMASK in binary should look like 00000000 00000000 00000000 11111111.
To make things easy, let’s just say the val byte array only contains 1 short, so the offset is 0 and the size is 2. So let’s set the byte array to val[0] = 11111111, val[1] = 00001111.
At i = 0, rtn is all 0’s, so rtn << Byte.SIZE just keeps the value the same. Then there’s (long) val[0], making it 8 bytes with all 1’s due to sign extension. But when you use & BYTEMASK, it sets all those extra 1’s to 0’s, leaving only that last byte all 1’s. Then you get rtn | val[0], which basically flips on any 1’s in the last byte of rtn.
For i = 1, (rtn << Byte.SIZE) pushes the least-significant byte over and leaves all 0’s in its place. Then (long) val[1] makes a long with all 0’s plus 00001111 for the least-significant byte, which is what we want, so using & BYTEMASK doesn’t change it. Then when rtn | val[1] is used, it sets rtn’s least-significant byte to 00001111. The final return value is now rtn = 00000000 00000000 00000000 00000000 00000000 00000000 11111111 00001111.
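The walkthrough above can be checked with a small sketch. It repeats the decode loop from the question and, for comparison, adds a hypothetical decodeWithoutMask variant (not from the book) that drops the & BYTEMASK step. The class name ByteMaskTrace is also just an illustration.

```java
class ByteMaskTrace {
    private static final int BYTEMASK = 0xFF; // int literal, 0x000000FF

    // Same loop as in the question, repeated so the sketch is self-contained.
    static long decodeIntBigEndian(byte[] val, int offset, int size) {
        long rtn = 0;
        for (int i = 0; i < size; i++) {
            // The cast sign-extends the byte; the mask clears the extended bits.
            rtn = (rtn << Byte.SIZE) | ((long) val[offset + i] & BYTEMASK);
        }
        return rtn;
    }

    // Hypothetical variant without the mask, to show the effect of sign extension.
    static long decodeWithoutMask(byte[] val, int offset, int size) {
        long rtn = 0;
        for (int i = 0; i < size; i++) {
            rtn = (rtn << Byte.SIZE) | (long) val[offset + i];
        }
        return rtn;
    }

    public static void main(String[] args) {
        byte[] val = { (byte) 0xFF, 0x0F }; // 11111111, 00001111

        // With the mask: prints ff0f
        System.out.println(Long.toHexString(decodeIntBigEndian(val, 0, 2)));

        // Without the mask: prints ffffffffffffff0f
        System.out.println(Long.toHexString(decodeWithoutMask(val, 0, 2)));
    }
}
```

Without the mask, the sign-extended 1’s from val[0] fill all of rtn on the first pass and stay in the upper bytes after the shift, which is exactly what the mask is there to prevent.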
So, I hope this wasn’t too long, and that it was understandable. I just want to know if the way I’m thinking about this is correct, and not just completely wacked-out logic. Also, one thing that confuses me is that BYTEMASK is 0xFF. In binary, this would be 11111111, so if it’s being implicitly cast to an int, wouldn’t it actually be 11111111 11111111 11111111 11111111 due to sign extension? If that’s the case, then it doesn’t make sense to me how BYTEMASK would even work. Thank you for reading.
Everything is right except for the last point:
0xFF is already an int (0x000000FF), so it won’t be sign-extended. In general, integer literals in Java are ints unless they end with an L or l, and then they are longs.
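Here is a minimal sketch of the difference between the int literal 0xFF and a byte holding the same bit pattern. The class name LiteralWidthDemo and the helpers widen and unsignedValue are made up for illustration.

```java
class LiteralWidthDemo {
    // Widening a byte to an int sign-extends it.
    static int widen(byte b) {
        return b;
    }

    // Masking with the int literal 0xFF keeps only the low 8 bits.
    static int unsignedValue(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        int mask = 0xFF;      // already an int: 0x000000FF, value 255
        byte b = (byte) 0xFF; // explicit narrowing cast: bits 11111111, value -1

        System.out.println(mask);                          // prints 255
        System.out.println(Integer.toHexString(widen(b))); // prints ffffffff
        System.out.println(unsignedValue(b));              // prints 255
    }
}
```

Sign extension only happens when a narrower signed value such as the byte b is widened. The literal 0xFF starts out as an int, so there is nothing to extend.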