I have a large number of PDF documents with XML files attached to them. I would like to extract those attached XML files and read them. How can I do this programmatically using .NET?
iTextSharp is also quite capable of extracting attachments, though you might have to use the low-level objects to do so.
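There are two ways to embed files in a PDF: at the document level, through the EmbeddedFiles name tree under the catalog's Names dictionary, or on a page, through a file attachment annotation.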
Once you have a file specification dictionary from either source, the file itself will be a stream within the dictionary labeled “EF” (embedded file).
So to list all the files at the document level, one would write code (in Java) as such:
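As a library-independent illustration of that traversal, here is a minimal sketch that models PDF dictionaries as plain Java maps and arrays as lists. It is not iText's API: the class `EmbeddedFileWalk`, its methods, and the sample data are all made up for this example. It also assumes a flat name tree (a single Names array with no Kids nodes) and reads only the "F" entry of each "EF" dictionary.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: PDF dictionaries are Maps, PDF arrays are Lists,
// and an embedded file stream is just its decoded bytes.
class EmbeddedFileWalk {

    // Follows catalog -> Names -> EmbeddedFiles -> Names array, then
    // fileSpec -> EF -> F for every (name, fileSpec) pair in that array.
    @SuppressWarnings("unchecked")
    static Map<String, byte[]> listEmbeddedFiles(Map<String, Object> catalog) {
        Map<String, byte[]> result = new LinkedHashMap<>();

        Map<String, Object> names = (Map<String, Object>) catalog.get("Names");
        if (names == null) return result;

        Map<String, Object> embeddedFiles = (Map<String, Object>) names.get("EmbeddedFiles");
        if (embeddedFiles == null) return result;

        // Assumption: flat name tree. A real tree may nest entries under Kids.
        List<Object> pairs = (List<Object>) embeddedFiles.get("Names");
        if (pairs == null) return result;

        // The array alternates: name string, file specification dictionary.
        for (int i = 0; i + 1 < pairs.size(); i += 2) {
            String name = (String) pairs.get(i);
            Map<String, Object> fileSpec = (Map<String, Object>) pairs.get(i + 1);

            Map<String, Object> ef = (Map<String, Object>) fileSpec.get("EF");
            if (ef == null) continue; // file spec without embedded data

            byte[] data = (byte[]) ef.get("F");
            if (data != null) result.put(name, data);
        }
        return result;
    }

    // Builds a tiny in-memory catalog with one attached XML file (sample data).
    static Map<String, Object> sampleCatalog() {
        Map<String, Object> ef = new LinkedHashMap<>();
        ef.put("F", "<invoice/>".getBytes(StandardCharsets.UTF_8));

        Map<String, Object> fileSpec = new LinkedHashMap<>();
        fileSpec.put("Type", "Filespec");
        fileSpec.put("EF", ef);

        List<Object> pairs = new ArrayList<>();
        pairs.add("invoice.xml");
        pairs.add(fileSpec);

        Map<String, Object> embeddedFiles = new LinkedHashMap<>();
        embeddedFiles.put("Names", pairs);

        Map<String, Object> names = new LinkedHashMap<>();
        names.put("EmbeddedFiles", embeddedFiles);

        Map<String, Object> catalog = new LinkedHashMap<>();
        catalog.put("Names", names);
        return catalog;
    }

    public static void main(String[] args) {
        Map<String, byte[]> files = listEmbeddedFiles(sampleCatalog());
        for (Map.Entry<String, byte[]> e : files.entrySet()) {
            System.out.println(e.getKey() + ": " + new String(e.getValue(), StandardCharsets.UTF_8));
        }
    }
}
```

With a real PDF library the lookups are the same, only performed on the library's own dictionary, array, and stream objects instead of maps and lists. File attachment annotations lead to the same kind of file specification dictionary, so the "EF" step applies to them unchanged.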