I have a need to search a pdf file to see if a certain string is present. The string in question is definitely encoded as text (ie. it is not an image or anything). I have tried just searching the file as though it was plain text, but this does not work.
Is it possible to do this? Are there any librarys out there for .net2.0 that will extract/decode all the text out of pdf file for me?
There are a few libraries available out there. Check out http://www.codeproject.com/KB/cs/PDFToText.aspx and http://itextsharp.sourceforge.net/
It takes a little bit of effort but it’s possible.