I’ve written a REST resource that serves a .tar.gz file. It’s working OK: I’ve tried requesting it, saving the data, and unpacking it (with tar xzvf [filename]), and I get the correct data.
However, I’m trying to use java.util.zip.GZIPInputStream and org.apache.tools.tar.TarInputStream to unzip and untar the .tar.gz that I’m serving in a JUnit test, so that I can verify automatically that it works. This is the code in my unit test, with some details removed:
HttpResponse response = <make request code here>
byte[] receivedBytes = FileHelper.copyInputStreamToByteArray(response.getEntity().getContent(), true);
GZIPInputStream gzipInputStream = new GZIPInputStream(new ByteArrayInputStream(receivedBytes));
TarInputStream tarInputStream = new TarInputStream(gzipInputStream);
TarEntry tarEntry = tarInputStream.getNextEntry();
ByteArrayOutputStream byteArrayOutputStream = null;
System.out.println("Record size: " + tarInputStream.getRecordSize());
while (tarEntry != null) // It only goes in here once
{
    byteArrayOutputStream = new ByteArrayOutputStream();
    tarInputStream.copyEntryContents(byteArrayOutputStream);
    tarEntry = tarInputStream.getNextEntry();
}
byteArrayOutputStream.flush();
byteArrayOutputStream.close();
byte[] archivedBytes = byteArrayOutputStream.toByteArray();
byte[] actualBytes = <get actual bytes>
Assert.assertArrayEquals(actualBytes, archivedBytes);
The final assert fails with a difference at byte X = (n * 512) + 1, where n is the greatest natural number such that n * 512 <= l and l is the length of the data. That is, I get the largest possible multiple of 512 bytes of data correctly, but debugging the test I can see that all the remaining bytes are zero. So, if the total amount of data is 1000 bytes, the first 512 bytes in archivedBytes are correct but the last 488 are all zero / unset, and if the total data is 262272 bytes, I get the first 262144 (512 * 512) bytes correctly but the remaining 128 bytes are all zero again.
Also, the tarInputStream.getRecordSize() call in the System.out.println above prints Record size: 512, so I presume that this is somehow related. However, since the archive works if I download it, I guess the data must be there, and there’s just something I’m missing.
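The arithmetic above can be checked with a few lines of Java. This is only a sketch of the observed pattern, assuming the 512-byte record size reported by getRecordSize(); the class and method names (TarBlockMath, intactBytes, zeroedBytes) are made up for illustration and are not part of any library.

```java
class TarBlockMath {
    // Record size reported by tarInputStream.getRecordSize() in the test.
    static final int RECORD_SIZE = 512;

    /** Largest multiple of RECORD_SIZE that is <= length, i.e. the bytes that arrive intact. */
    static int intactBytes(int length) {
        return (length / RECORD_SIZE) * RECORD_SIZE;
    }

    /** Trailing bytes that show up as zero in the failing test. */
    static int zeroedBytes(int length) {
        return length - intactBytes(length);
    }

    public static void main(String[] args) {
        for (int length : new int[] {1000, 262272}) {
            System.out.println(length + " bytes: " + intactBytes(length)
                    + " intact, " + zeroedBytes(length) + " zeroed");
        }
    }
}
```

For the two sizes mentioned, this gives 512 intact and 488 zeroed bytes, and 262144 intact and 128 zeroed bytes.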
Stepping into the tarInputStream.copyEntryContents(byteArrayOutputStream) with the 1000 byte data, in
int numRead = read(buf, 0, buf.length);
the numRead is 1000, but looking at the buffer, only the first 512 bytes are non-zero. Maybe I shouldn’t be using that method to get the data out of the TarInputStream?
If anyone knows how it’s supposed to work, I’d be very grateful for any advice or help.
It turned out that I was wrong in my original question, and the error was in the resource code. I wasn’t closing the entry on the TarOutputStream when writing to it. I guess this was not causing any problems when requesting it manually from the server, maybe because the entry was closed with the connection or something, but it worked differently when requested from a unit test… though I must admit that doesn’t make a whole lot of sense to me 😛
Looking at the fragment of my writing code below, I was missing line 3.
I didn’t even know there was such a thing as a “closeEntry” on the TarOutputStream… I do now! 😛
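Why a missing closeEntry would lose exactly the tail after the last full 512-byte record can be illustrated with a toy model. This is not Ant’s TarOutputStream and not the resource code from the post: ToyEntryWriter and its methods are hypothetical, and the sketch simply assumes that a tar writer emits only full 512-byte records and holds back a partial record until the entry is closed.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

class RecordBufferDemo {
    static final int RECORD_SIZE = 512;

    /** Hypothetical stand-in for a tar entry writer (assumed behaviour, not Ant's code). */
    static class ToyEntryWriter {
        private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        private final byte[] pending = new byte[RECORD_SIZE];
        private int pendingLen = 0;

        /** Buffers data and emits it to the sink only in full records. */
        void write(byte[] data) {
            for (byte b : data) {
                pending[pendingLen++] = b;
                if (pendingLen == RECORD_SIZE) {
                    sink.write(pending, 0, RECORD_SIZE);
                    pendingLen = 0;
                }
            }
        }

        /** Zero-pads the partial record, if any, and emits it. */
        void closeEntry() {
            if (pendingLen > 0) {
                Arrays.fill(pending, pendingLen, RECORD_SIZE, (byte) 0);
                sink.write(pending, 0, RECORD_SIZE);
                pendingLen = 0;
            }
        }

        byte[] emitted() {
            return sink.toByteArray();
        }
    }

    /** Writes dataLength bytes and reports how many bytes reached the sink. */
    static int bytesEmitted(int dataLength, boolean closeEntry) {
        ToyEntryWriter writer = new ToyEntryWriter();
        byte[] data = new byte[dataLength];
        Arrays.fill(data, (byte) 1);
        writer.write(data);
        if (closeEntry) {
            writer.closeEntry();
        }
        return writer.emitted().length;
    }

    public static void main(String[] args) {
        System.out.println("without closeEntry: " + bytesEmitted(1000, false));
        System.out.println("with closeEntry:    " + bytesEmitted(1000, true));
    }
}
```

In this model, 1000 bytes written without closeEntry leave only the first 512 in the output, while calling closeEntry emits the remaining 488 bytes padded out to a second full record.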