I am looking to take a PDF and extract any text from it. I

Question

Asked: May 11, 2026

I want to extract the text from a PDF and then make it searchable through ColdFusion’s built-in Verity search.

Are there any libraries that already do this well? Java or .NET libraries are in scope (Java preferred), since they can be called from CF.

Any insights or experiences would be greatly appreciated… thanks!

Edit: As far as I know, CF can index PDF files when the text is embedded in the PDF. The PDFs I have to deal with contain the text only as scanned images.


1 Answer

score 0 · Answer 1 · 2026-05-11T02:37:17+00:00

Added an answer on May 11, 2026 at 2:37 am

If you are able to run your own software (e.g. on a dedicated server or VPS), you could investigate calling Tesseract OCR through cfexecute to convert the PDFs to text.
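Since Java was mentioned as the preferred option, here is a rough sketch of how that pipeline might look when driven from Java instead of directly from cfexecute. Tesseract reads image files rather than PDFs, so the sketch assumes each page is rasterized first. It assumes the `pdftoppm` tool (from Poppler) and the `tesseract` CLI are both installed and on the PATH; the class name `PdfOcrSketch`, its method names, and the 300 DPI setting are purely illustrative and not part of any library.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: class and method names are illustrative only.
class PdfOcrSketch {

    // Poppler: "pdftoppm -r <dpi> -png <pdf> <prefix>" writes one PNG per page.
    static List<String> rasterizeCommand(Path pdf, Path pagePrefix, int dpi) {
        return List.of("pdftoppm", "-r", String.valueOf(dpi), "-png",
                pdf.toString(), pagePrefix.toString());
    }

    // Tesseract: "tesseract <image> <outputBase>" writes <outputBase>.txt.
    static List<String> ocrCommand(Path image, Path outputBase) {
        return List.of("tesseract", image.toString(), outputBase.toString());
    }

    // Runs an external command and fails loudly on a non-zero exit code.
    static void run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException(command.get(0) + " exited with code " + exit);
        }
    }

    // Rasterizes the PDF into workDir, then OCRs every page image found there.
    static void ocrPdf(Path pdf, Path workDir) throws IOException, InterruptedException {
        Files.createDirectories(workDir);
        run(rasterizeCommand(pdf, workDir.resolve("page"), 300)); // 300 DPI is an assumed setting

        List<Path> pages;
        try (Stream<Path> files = Files.list(workDir)) {
            pages = files.filter(f -> f.getFileName().toString().endsWith(".png"))
                         .sorted()
                         .collect(Collectors.toList());
        }
        for (Path page : pages) {
            String name = page.getFileName().toString();
            String base = name.substring(0, name.length() - ".png".length());
            run(ocrCommand(page, workDir.resolve(base))); // produces <base>.txt next to the image
        }
    }

    public static void main(String[] args) throws Exception {
        // Usage: java PdfOcrSketch <input.pdf> <workDir>
        ocrPdf(Path.of(args[0]), Path.of(args[1]));
    }
}
```

The resulting per-page `.txt` files could then be placed in a directory that a Verity collection indexes. The same two commands could presumably also be issued straight from cfexecute; the Java wrapper is only one way to arrange it.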
