I want to extract some specified text from PDF files, along with the position of that text.
I know xpdf and MuPDF can parse PDF files, so I think they may help me accomplish this task.
But how do I use these two libraries to get the text position?
MuPDF comes with a couple of tools, one of which is pdfdraw. If you use pdfdraw with the -tt option, it will generate an XML document containing all characters and their exact positioning information. From there you should be able to find what you need.
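As a rough sketch of the "from there" step, the Python below searches character-level XML for a target string and returns its bounding box. The XML layout is an assumption, not a confirmed schema: it supposes `page` elements containing `span` elements, each holding `char` elements with a one-character `ucs` attribute and a `bbox="[x0 y0 x1 y1]"` attribute. The names `SAMPLE_XML`, `parse_bbox`, and `find_text` are made up for this illustration. Check the output of your own pdfdraw version and adjust the element and attribute names to match.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. The real "pdfdraw -tt" output may use different
# element names, attribute names, or coordinate formats.
SAMPLE_XML = """<document>
<page number="1">
<span font="Helvetica" size="12">
<char ucs="H" bbox="[72 700 80 712]"/>
<char ucs="i" bbox="[80 700 83 712]"/>
<char ucs=" " bbox="[83 700 86 712]"/>
<char ucs="P" bbox="[86 700 94 712]"/>
<char ucs="D" bbox="[94 700 102 712]"/>
<char ucs="F" bbox="[102 700 109 712]"/>
</span>
</page>
</document>"""


def parse_bbox(text):
    """Turn an assumed "[x0 y0 x1 y1]" string into a tuple of floats."""
    return tuple(float(v) for v in text.strip("[]").split())


def find_text(xml_text, target):
    """Return (page number, bounding box) for each match of target.

    Matching is done within a single span, so text that crosses a span
    boundary is not found by this sketch.
    """
    root = ET.fromstring(xml_text)
    hits = []
    for page in root.iter("page"):
        for span in page.iter("span"):
            # Keep only chars with exactly one character and a bbox, so
            # string offsets line up with list indices.
            chars = [
                (c.get("ucs"), parse_bbox(c.get("bbox")))
                for c in span.iter("char")
                if c.get("bbox") and len(c.get("ucs", "")) == 1
            ]
            line = "".join(ch for ch, _ in chars)
            start = line.find(target)
            while start != -1:
                boxes = [b for _, b in chars[start:start + len(target)]]
                # Union of the per-character boxes covers the whole match.
                union = (
                    min(b[0] for b in boxes),
                    min(b[1] for b in boxes),
                    max(b[2] for b in boxes),
                    max(b[3] for b in boxes),
                )
                hits.append((page.get("number"), union))
                start = line.find(target, start + 1)
    return hits


if __name__ == "__main__":
    for page_number, box in find_text(SAMPLE_XML, "PDF"):
        print(page_number, box)
```

To use it on a real file, you would save the XML that pdfdraw writes and pass its contents to `find_text` in place of `SAMPLE_XML`.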