I’m outputting a byte array to a text file using the following method:
// Write the raw bytes; try-with-resources closes the stream even if write() throws.
try (FileOutputStream fos = new FileOutputStream(filePath + ".8102")) {
    fos.write(concatenatedIVCipherMAC);
} catch (IOException e) {
    e.printStackTrace();
}
which writes UTF-16 encoded data to the file, for example:
¢¬6î)ªÈP~m˜LïiƟê•Àe»/#Ó ö¹¥‘þ²XhÃ&¼lG:Öé )GU3«´DÃ{+í—Ã]íò
However, when I read it back in, I get þÿ prepended to the data, e.g.:
þÿ¢¬6î)ªÈP~m˜LïiƟê•Àe»/?#Ó ö¹¥‘þ²XhÃ&¼lG:Öé )GU3«´DÃ{+í—Ã]íò
This is the method I’m using to read in the file:
private String getFilesContents()
{
    String fileContents = "";
    Scanner sc = null;
    try {
        sc = new Scanner(file, "UTF-16");
        System.out.println("Can read file: " + file.canRead());
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
    while (sc.hasNextLine()) {
        fileContents += sc.nextLine();
    }
    sc.close();
    return fileContents;
}
and then byte[] contentsOfFile = fileContents.getBytes("UTF-16"); to convert the String into a byte array.
A quick Google search told me that þÿ represents the byte order, but is it Java putting that there or Windows? How can I avoid having þÿ prepended to the data I’m reading in? I was thinking of just ignoring the first two bytes, but if it is Windows that adds them, this will obviously break the program on other platforms.
edit: changed appended to prepended.
Yes. You shouldn’t be trying to treat it as text anywhere.
If you really need to convert arbitrary binary data into text, use Base64 to convert it. Other than that, stick to byte arrays, InputStream and OutputStream.
I don’t know exactly why you’re supposedly getting extra characters, but the fact that you haven’t got real text to start with suggests that it’s not really worth diagnosing that side. Just start handling binary data as binary data instead.
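To make that concrete, here is a minimal sketch of handling the data as bytes from end to end, with Base64 only for the case where it has to travel as text. The class name BinaryRoundTrip, its helper methods and the temporary file are made up for illustration and are not from the question. The last lines of main also hint at one plausible source of the extra bytes: when Java encodes a String with the "UTF-16" charset, it writes a big-endian byte-order mark (0xFE 0xFF) first, and those two bytes display as þÿ in Latin-1.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
public class BinaryRoundTrip {

    // Write the bytes to a temporary file and read them straight back,
    // with no charset, String or Scanner involved at any point.
    static byte[] roundTrip(byte[] data) {
        try {
            Path path = Files.createTempFile("cipher", ".8102");
            try {
                Files.write(path, data);
                return Files.readAllBytes(path);
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Base64 is only needed when the bytes must be carried as text.
    static String toText(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Stand-in for concatenatedIVCipherMAC; includes bytes that are not valid text.
        byte[] original = {(byte) 0xA2, (byte) 0xAC, 0x36, 0x00, (byte) 0xFF, 0x0A};

        System.out.println("file round trip intact:   "
                + Arrays.equals(original, roundTrip(original)));
        System.out.println("Base64 round trip intact: "
                + Arrays.equals(original, fromText(toText(original))));

        // Encoding a String as "UTF-16" makes Java emit a big-endian BOM first.
        byte[] utf16 = "A".getBytes(StandardCharsets.UTF_16);
        System.out.printf("UTF-16 bytes of \"A\": %d bytes, starting %02X %02X%n",
                utf16.length, utf16[0] & 0xFF, utf16[1] & 0xFF);
    }
}
```

Reading the file with Files.readAllBytes gives back exactly the array that was written, so there is nothing to strip afterwards and nothing platform-specific to worry about.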
EDIT: Have a look at Guava’s IO helpers for simplicity…