One of the important points of Java is that it'll…

Question

Asked: May 11, 2026

I need to invoke Tesseract OCR (an open source C++ library that does optical character recognition) from a Java application server. Right now it's easy enough to run the executable using Runtime.exec(). The basic logic would be:

Save the image that is currently held in memory to a file (a .tif).
Pass the image file name to the tesseract command-line program.
Read the output text file back into Java using FileReader.
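The three steps above can be sketched roughly as follows. This is only a sketch under some assumptions: Java 11 or later, a `tesseract` binary on the `PATH`, and the usual `tesseract <image> <outputBase>` invocation, which writes its result to `<outputBase>.txt`. It uses `ProcessBuilder` and `Files.readString` in place of `Runtime.exec()` and `FileReader`; the class name, method names, and file names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class TesseractCli {

    // Builds the command line "tesseract <image> <outputBase>".
    // Tesseract appends ".txt" to the output base name itself.
    static List<String> buildCommand(Path image, Path outputBase) {
        return List.of("tesseract", image.toString(), outputBase.toString());
    }

    // Assumes tiffBytes already holds a TIFF-encoded image.
    static String recognize(byte[] tiffBytes) throws IOException, InterruptedException {
        Path dir = Files.createTempDirectory("ocr");
        Path image = dir.resolve("page.tif");
        Path outputBase = dir.resolve("page");
        Path text = dir.resolve("page.txt");
        try {
            // Step 1: save the in-memory image to a file.
            Files.write(image, tiffBytes);

            // Step 2: run the command-line program and wait for it to finish.
            // Its console output is discarded so the child cannot block on a full pipe.
            Process process = new ProcessBuilder(buildCommand(image, outputBase))
                    .redirectErrorStream(true)
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .start();
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                throw new IOException("tesseract exited with code " + exitCode);
            }

            // Step 3: read the recognized text back into Java.
            return Files.readString(text, StandardCharsets.UTF_8);
        } finally {
            // Remove the temporary files; the directory delete assumes nothing else was written there.
            Files.deleteIfExists(image);
            Files.deleteIfExists(text);
            Files.deleteIfExists(dir);
        }
    }
}
```

Every call pays for one file write, one process launch, and one file read, which is the overhead a JNI wrapper would be trying to remove.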

How much of a performance improvement am I likely to get by writing a JNI wrapper for Tesseract? Unfortunately, there is no open source JNI wrapper that works on Linux, so I would have to write one myself, and I am wondering whether the benefit is worth the development cost.


1 Answer

score 0 · Answer 1 · 2026-05-11T01:50:05+00:00

It’s hard to say whether it would be worth it. If the OCR code runs in-process via JNI and can access the image data directly, without it first being written to a file, then that would certainly eliminate the disk I/O involved in the current approach.

I’d recommend going with the simpler approach and only undertaking the JNI option if its performance turns out not to be acceptable. At least then you’ll have a working baseline to benchmark, and you can estimate the performance gains you might be able to realize.
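As a rough illustration of that kind of benchmarking, here is a minimal timing harness, assuming Java 8 or later. The names `StageTimer` and `averageMillis` are hypothetical, and the 1 MiB buffer is a placeholder payload rather than a real TIFF. It times only the temp-file write, which is one of the costs JNI could avoid; wrapping the whole exec-based call the same way gives the total to compare it against. For serious measurements a dedicated tool such as JMH would be more trustworthy than a hand-rolled loop.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Callable;

public class StageTimer {

    // Runs the task `warmup` times untimed, then `runs` times timed,
    // and returns the mean wall-clock time per timed run in milliseconds.
    static double averageMillis(int warmup, int runs, Callable<?> task) throws Exception {
        for (int i = 0; i < warmup; i++) {
            task.call();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.call();
        }
        return (System.nanoTime() - start) / 1_000_000.0 / runs;
    }

    public static void main(String[] args) throws Exception {
        byte[] image = new byte[1 << 20]; // placeholder payload, not a real image

        // Time just the "save image to disk" step of the exec-based approach.
        double writeMs = averageMillis(3, 20, () -> {
            Path tmp = Files.createTempFile("ocr", ".tif");
            Files.write(tmp, image);
            Files.delete(tmp);
            return null;
        });
        System.out.printf("temp-file write: %.3f ms per call%n", writeMs);
    }
}
```

If the file write and process launch turn out to be a small fraction of the total time per image, most of the time is going into the recognition itself and a JNI wrapper is unlikely to help much.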
