I recognize there is a similar question on here, from a fellow who wanted to break a single file into multiple files. Regrettably, though, since there is a certain amount of overhead associated with creating a new file, this solution won’t work for me.
Background (not necessary to read):
What I’m trying to do is generate PDF files of arbitrary size to seed a database (i.e. calling a method with the size of the file desired, in kilobytes or megabytes, should generate a file of the desired size).
Currently, I’m ensuring the input data is incompressible by making it random and putting it in 1KB blocks (in paragraph form) into the file. After plotting the number of output bytes as a function of the number of desired bytes, I’ve altered the algorithm to account for this (pleasantly and expectedly) linear relationship.
However, due to the stochastic nature of the input data, there is a certain amount of uncertainty associated with this method, the absolute value of which increases with desired size (so, despite the fact that it’s hundredths of a percent off, those hundredths of a percent become pretty significant in absolute terms in a 20 MB file).
Ideally, I would be able to generate files of arbitrary size to within a kilobyte. To do this I would need to know the file size after any given operation, and to know that, I'd need to know when PdfWriter writes its buffer, or at least how large that buffer is (i.e. if the buffer is less than one kilobyte, it doesn't matter when it writes, because I only care about being accurate to within that margin).
The Question:
Is there a way to check the number of bytes that will actually be written to disk for a PDF using iText, without closing the document?
Or does ‘closing the document’ just mean it flushes the buffer and closes the stream (i.e. it doesn’t need to write any additional non-user-input quantity of data to the file when it closes)?
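When you build your PdfWriter, you must specify an OutputStream, which does not have to be a FileOutputStream. So if you build it on a stream that keeps track of its own size (for example a ByteArrayOutputStream), you can check the buffer size at any moment by asking that stream for its size.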
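The idea can be sketched roughly as follows. This is only an illustration using plain `java.io` classes, not the original code from this answer: `ByteCountDemo` and `CountingOutputStream` are hypothetical names, and the `PdfWriter.getInstance(document, counter)` call appears only in a comment as the assumed place where iText would be hooked in. The wrapper counts every byte handed to the underlying stream, so it would also work in front of a `FileOutputStream` when holding the whole file in memory is undesirable.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class ByteCountDemo {

    /** Hypothetical helper: counts every byte passed through to the wrapped stream. */
    static class CountingOutputStream extends FilterOutputStream {
        private long count;

        CountingOutputStream(OutputStream out) {
            super(out);
        }

        @Override
        public void write(int b) throws IOException {
            out.write(b);
            count++;
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            // Write the whole range in one call so bytes are not counted twice.
            out.write(b, off, len);
            count += len;
        }

        long getCount() {
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        CountingOutputStream counter = new CountingOutputStream(baos);

        // Assumed usage with iText (not executed in this sketch):
        //   PdfWriter.getInstance(document, counter);
        // Here a 1 KB block is written directly to stand in for PDF output.
        counter.write(new byte[1024]);

        System.out.println(counter.getCount()); // prints 1024
        System.out.println(baos.size());        // prints 1024
    }
}
```

Two caveats, both worth verifying against your iText version: the writer may buffer output internally, so the count can lag behind what has been added until the writer is flushed, and closing the document appends further bytes (a PDF ends with a cross-reference table and trailer), so the count before closing should be treated as a lower bound on the final size.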
Hope this will help you.