I have an encryption algorithm (AES) that accepts a file converted to a byte array and encrypts it.
Since I am going to process very large files, the JVM may run out of memory.
I am planning to read each file into multiple byte arrays, each containing part of the file, and then feed them to the algorithm one at a time. Finally, I would merge the results to produce the encrypted file.
So my question is: Is there any way to read a file part by part to multiple byte arrays?
I thought I could use the following to read the file into a byte array:
IOUtils.toByteArray(InputStream input).
And then split the array into multiple byte arrays using:
Arrays.copyOfRange()
But I am afraid that the code that reads the file into a byte array will make the JVM run out of memory.
Look up cipher streams in Java. You can use them to encrypt/decrypt streams on the fly so you don’t have to store the whole thing in memory. All you have to do is copy the regular FileInputStream for your source file to the CipherOutputStream that’s wrapping your FileOutputStream for the encrypted sink file. IOUtils even conveniently contains a copy(InputStream, OutputStream) method to do this copy for you.
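A minimal sketch of this copy, assuming AES in CBC mode with PKCS5 padding and a key and IV that the caller already manages (the class name, method name, and parameters are hypothetical, not from the original answer). A plain buffer loop stands in for IOUtils.copy so that the sketch needs only the JDK:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical helper class used only for illustration.
class StreamingEncryptExample {

    // Encrypts source into sink while holding only one small buffer in memory.
    static void encrypt(File source, File sink, SecretKey key, byte[] iv)
            throws IOException, GeneralSecurityException {
        // Assumption: AES/CBC/PKCS5Padding; substitute the transformation you actually use.
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));

        try (InputStream in = new FileInputStream(source);
             OutputStream out = new CipherOutputStream(new FileOutputStream(sink), cipher)) {
            byte[] buffer = new byte[8192];
            int read;
            // Same effect as IOUtils.copy(in, out): move the data through in small pieces.
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } // Closing the CipherOutputStream finalizes the cipher and writes the last padded block.
    }
}
```

The buffer size of 8192 bytes is arbitrary; memory use stays constant regardless of the file size.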
If you need to know the number of bytes that were copied, you can use IOUtils.copyLarge instead of IOUtils.copy if the file sizes exceed Integer.MAX_VALUE bytes (2 GB).
To decrypt the file, do the same thing, but use CipherInputStream instead of CipherOutputStream and initialize your Cipher using Cipher.DECRYPT_MODE. Take a look here for more info on cipher streams in Java.
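The decryption direction could look roughly like this. As before, this is only a sketch: the class and method names are hypothetical, and it assumes AES/CBC/PKCS5Padding with the same key and IV that were used for encryption:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical helper class used only for illustration.
class StreamingDecryptExample {

    // Decrypts source into sink; the ciphertext is read and decrypted piece by piece.
    static void decrypt(File source, File sink, SecretKey key, byte[] iv)
            throws IOException, GeneralSecurityException {
        // Assumption: must match the transformation, key, and IV used when encrypting.
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));

        try (InputStream in = new CipherInputStream(new FileInputStream(source), cipher);
             OutputStream out = new FileOutputStream(sink)) {
            byte[] buffer = new byte[8192];
            int read;
            // Each read returns already-decrypted bytes from the wrapped file stream.
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
    }
}
```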
This will save you space because you won’t need to store byte arrays of your own anymore. The only stored byte[] in this system is the internal byte[] of the Cipher, which will get cleared each time enough input is entered and an encrypted block is returned by Cipher.update, or on Cipher.doFinal when the CipherOutputStream is closed. However, you don’t have to worry about any of this since it’s all internal and everything is managed for you.
Edit: note that this can result in certain encryption exceptions being ignored, particularly BadPaddingException and IllegalBlockSizeException. This behavior can be found in the CipherOutputStream source code. (Granted, this source is from the OpenJDK, but it probably does the same thing in the Sun JDK.) Also, the CipherOutputStream javadocs state that the class catches all exceptions that are not thrown by its ancestor classes, which implies that the cryptographic exceptions are ignored, and they are. This may cause some unexpected behavior while trying to read an encrypted file, especially for block and/or padding encryption algorithms like AES. Keep in mind that when this happens you will get zero or partial output for the encrypted (or decrypted, for CipherInputStream) file.
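If the swallowed exceptions are a concern, one alternative (not part of the original answer) is to drive the Cipher yourself with update and doFinal, which is close to the chunk-by-chunk approach from the question. The sketch below is an illustration under stated assumptions: the class and method names are hypothetical, and the caller is expected to pass in an already initialized Cipher.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;

// Hypothetical helper class used only for illustration.
class ChunkedCipherExample {

    // Feeds the stream through an initialized Cipher in fixed-size chunks.
    // Unlike the cipher streams, a failure in doFinal (for example
    // BadPaddingException or IllegalBlockSizeException) reaches the caller.
    static void process(Cipher cipher, InputStream in, OutputStream out)
            throws IOException, GeneralSecurityException {
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            // update may return null when there is not yet enough input for a full block.
            byte[] chunk = cipher.update(buffer, 0, read);
            if (chunk != null) {
                out.write(chunk);
            }
        }
        // doFinal flushes the cipher's internal buffer and applies or checks the padding.
        out.write(cipher.doFinal());
    }
}
```

The same method works for both directions, depending on whether the Cipher was initialized with Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE.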