I’m trying to achieve this:
I have a PDF as a byte[] in a Java web service, and I must send it as a Base64 string to a .NET client, which does the following to reconstruct the file:
Encoding.Convert(Encoding.Unicode, Encoding.Default, Convert.FromBase64String(inputJava))
I cannot change the client code. Right now the Java web service calls another .NET web service, which does this as part of turning the byte[] into a Base64 string:
System.Text.Encoding.Convert(System.Text.Encoding.GetEncoding(1252), System.Text.Encoding.Unicode, b);
Besides the Base64 encoding, which I can do in various ways (e.g. with org.apache.commons.codec.binary.Base64), I have to turn the original byte[] into a UTF-16LE byte[]…
I tried this:
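byte[] output = new byte[b.length * 2];
for (int i = 0; i < b.length; i++) {
    int val = b[i] & 0xFF;        // treat the signed Java byte as unsigned (0-255)
    output[2 * i] = (byte) val;   // low byte first (little-endian)
    output[2 * i + 1] = 0;        // high byte assumed to be zero
}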
This works fine for values below 128 (e.g. 1 => 0100, 2 => 0200, …, 127 => 7F00), but for values from 128 to 255 I don’t know how to get the equivalent two-byte values. I know that for byte 156 (0x9C) the corresponding bytes are 0x53 0x01 (decimal 83, 1) and for byte 224 (0xE0) the corresponding bytes are 0x7D 0x01 (decimal 125, 1), but I didn’t manage to find an algorithm to get all the other values.
Is there a mapping table between a byte value and the corresponding two-byte UTF-16LE value, or an algorithm to map values from 128 to 255?
Thanks in advance!
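For reference, code page 1252 and ISO-8859-1 differ only in the byte range 128 to 159 (0x80 to 0x9F), so a mapping table needs just 32 entries, and every other byte value is its own code point. Below is a minimal sketch of such a table in Java, not the code from the answer. The class name Cp1252ToUtf16Le and the method convert are made up for illustration, and the five bytes that Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D) are assumed to map to the code points with the same value, which is how I understand .NET's code page 1252 to behave. Check this against the real service output.

```java
final class Cp1252ToUtf16Le {
    // Unicode code points for Windows-1252 bytes 0x80..0x9F.
    // Assumption: the undefined bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D
    // map to the same-valued code points (U+0081, U+008D, ...).
    private static final char[] HIGH = {
        0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
        0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
        0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
        0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
    };

    // Interprets each input byte as a Windows-1252 character and
    // writes it as one little-endian UTF-16 code unit (two bytes).
    static byte[] convert(byte[] b) {
        byte[] out = new byte[b.length * 2];
        for (int i = 0; i < b.length; i++) {
            int val = b[i] & 0xFF; // unsigned byte value, 0-255
            int cp = (val >= 0x80 && val <= 0x9F) ? HIGH[val - 0x80] : val;
            out[2 * i] = (byte) (cp & 0xFF);  // low byte first
            out[2 * i + 1] = (byte) (cp >>> 8); // then high byte
        }
        return out;
    }
}
```

With this table, byte 156 (0x9C) becomes 0x53 0x01 and byte 128 (0x80) becomes 0xAC 0x20. The result can then be passed to any Base64 encoder.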
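I finally found a solution. It looks like only bytes from 128 to 159 need a special mapping (these are not really surrogate pairs, just Windows-1252 characters whose code point differs from the byte value). I use this piece of code to emulate the .NET Unicode encoding: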