I know the OCR question with Python has already been discussed many times.
However, I didn’t find anything that helps, except this question:
Python Tesseract OCR question.
But it didn’t solve my problem.
I need to make a little script to capture the text inside an opened window (of a text editor).
So it should:
- Take a screenshot
- Find the position of the text editor window and crop the screenshot to it (not sure if this step is needed)
- Convert it to grayscale and pass it to tesseract
I’m kind of a newbie to Python and I don’t know if this is feasible.
Anyway, thanks in advance for any hints.
Giorgio
This is certainly possible, but generally unreasonable; there are better ways. Say you are parsing a webpage: you could grab the HTML text directly without running it through OCR. Or, if you want to read the text in an image, you can fetch the page with urllib2 (urllib.request in Python 3), parse the HTML, find the image, and download it straight to a file. There are also many HTML parsers for Python that you can use. Converting to greyscale is simple with PIL or ImageMagick. From there, you can run the image through an OCR tool, or do it within the script using a Python wrapper like python-tesseract.
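To illustrate the "grab the text without OCR" route, here is a minimal sketch using only the standard library's html.parser (Python 3). The names TextExtractor and html_to_text are made up for this example, and it assumes you already have the page source as a string.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_text(html):
    """Return the text nodes of an HTML string, one per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

To run it on a live page you would first download the source, for example with urllib.request.urlopen(url).read().decode(), and pass the result to html_to_text. A dedicated parser library will cope better with messy real-world markup; this only shows the idea.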
Alternatively, if you insist on taking a screenshot, something like this would be useful for you. I still hold that there are almost always better ways, but this should get you started if you want to try it out.
This was taken from Take a screenshot via a python script. [Linux]
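Separately from the linked answer, here is a rough sketch of the three steps in the question (screenshot, crop, greyscale plus OCR). It assumes Pillow and pytesseract are installed and that the tesseract binary is on your PATH. The helpers clamp_box and ocr_window are hypothetical names, and the window coordinates have to come from you; finding the window position automatically is platform-specific and not covered here.

```python
def clamp_box(box, screen_size):
    """Clip a (left, top, right, bottom) box to the screen bounds."""
    left, top, right, bottom = box
    width, height = screen_size
    left, top = max(0, left), max(0, top)
    right, bottom = min(width, right), min(height, bottom)
    if right <= left or bottom <= top:
        raise ValueError("window box lies outside the screen")
    return (left, top, right, bottom)


def ocr_window(box):
    """Screenshot the screen, crop to `box`, greyscale it, and OCR it."""
    # Imported here so the helper above works without these packages.
    from PIL import ImageGrab  # assumption: Pillow is installed
    import pytesseract         # assumption: pytesseract + tesseract installed

    screenshot = ImageGrab.grab()                 # full-screen capture
    region = screenshot.crop(clamp_box(box, screenshot.size))
    grey = region.convert("L")                    # "L" = 8-bit greyscale
    return pytesseract.image_to_string(grey)
```

Calling ocr_window((100, 100, 900, 700)) would return whatever text tesseract recognises in that rectangle. ImageGrab support varies by platform (on Linux it depends on the Pillow version and display server), so treat this as a starting point, not a guaranteed recipe.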