I am working on a project that currently takes a .tiff and compares the defined template document to the document in question. We are moving away from the .tiff format for a variety of reasons, but mainly because the new files will be coming in as PDFs.
I see two potential solutions to the issue. First, convert the PDF to a .tiff and use the existing code.
Or second, use a PDF library that will compare the template PDF to the PDF that is received.
Because the PDF that is received will come from an outside source, we won’t know for sure whether it is text-based or image-based, so the library or tool will have to be able to compare both.
Any suggestions on tools/libraries you have found helpful would be great!
Thank you in advance!
dj
What we ended up doing was using the Aspose.Pdf library. I ended up learning there are two types of PDFs: text-based and image-based.
I did not have any issues comparing the text-based PDFs. However, when an image-based PDF was received, we were converting the PDF to a .tiff so that we could use Microsoft’s MODI to compare the PDF against our specified template. The .tiff would be a blank image rather than the actual content of the PDF. The Aspose.Pdf library did cost some money; however, in the end the library did exactly what we needed, and it allowed us to meet our client’s needs.
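For anyone who wants to see the shape of the text-based comparison, here is a minimal sketch in Python. It assumes the text has already been extracted from both PDFs by whatever library you use (the extraction itself is not shown, and this is not our actual code). The function names, the 0.9 threshold, and the "image_based" label are all made up for illustration; none of them come from Aspose.Pdf or MODI.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Collapse whitespace runs so layout differences do not count as changes."""
    return " ".join(text.split())


def compare_to_template(template_text: str, received_text: str,
                        threshold: float = 0.9) -> str:
    """Classify a received document against the template.

    Returns "image_based" when no text could be extracted (hypothetical
    label meaning the document has to go down the image comparison path),
    otherwise "match" or "mismatch" based on the similarity ratio.
    The 0.9 threshold is an assumption and would need tuning on real data.
    """
    received = normalize(received_text)
    if not received:
        return "image_based"
    ratio = SequenceMatcher(None, normalize(template_text), received).ratio()
    return "match" if ratio >= threshold else "mismatch"


if __name__ == "__main__":
    template = "Invoice Number:\nCustomer Name:\nTotal Due:"
    print(compare_to_template(template, "Invoice  Number: Customer Name: Total Due:"))
    print(compare_to_template(template, "Completely unrelated letter text"))
    print(compare_to_template(template, "   "))
```

An empty extraction result is only a rough hint that the PDF is image-based, so treat that branch as a signal to fall back to the image comparison, not as a reliable test.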