I know the OCR question with Python has already been discussed many times. However

Asked: June 9, 2026 (2026-06-09T06:03:06+00:00)

I know the OCR question with Python has already been discussed many times.
However, I didn't find anything that seems to help me except this question:
Python Tesseract OCR question.
But it didn't solve my problem.

I need to make a little script to capture the text inside an opened window (of a text editor).

So it should:

1. Take a screenshot
2. Find the position of the text editor window and crop the screenshot to it (not sure if this step is needed)
3. Convert it to grayscale and pass it to Tesseract

I'm kind of a newbie to Python and I don't know if this is feasible.

Anyway, thanks in advance for any hints.

Giorgio

1 Answer

Editorial Team · Answer 1 · 2026-06-09T06:03:07+00:00

This is certainly possible, but generally unreasonable; there are usually better ways. Say you are parsing a webpage: you could grab the HTML text directly without running it through OCR, or, if you want to read the text in an image, fetch the page with urllib2, parse the HTML, select the image, and download it directly to a file. There are many HTML parser alternatives in Python that you can use as well. Converting to greyscale is simple with PIL or ImageMagick. From there, you can run the image through an OCR program, or do it within the script with a Python wrapper like python-tesseract.
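As a rough sketch of the webpage route, here is one way to list the images on a page and download one. It assumes Python 3, where the job of urllib2 is done by urllib.request, and uses only the standard library. The names ImageCollector, find_image_urls and download are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ImageCollector(HTMLParser):
    """Collect the absolute URL of every <img> tag that has a src attribute."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page URL.
                self.image_urls.append(urljoin(self.base_url, src))


def find_image_urls(html_text, base_url):
    """Return the image URLs found in html_text, in document order."""
    collector = ImageCollector(base_url)
    collector.feed(html_text)
    return collector.image_urls


def download(url, path):
    """Fetch url and write the raw bytes to path (hypothetical helper)."""
    with urlopen(url) as response, open(path, "wb") as out:
        out.write(response.read())
```

With the page source in a string, find_image_urls(page_html, page_url) gives the candidates, and download(url, "image.png") saves the one you pick so it can be handed to the OCR step.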

Alternatively, if you insist on taking a screenshot, something like this would be useful for you. I still hold that there are almost always better ways, but this should get you started if you want to try it out.

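# Requires PyGTK (the GTK+ 2 bindings), which only runs on Python 2.
import gtk.gdk

# The root window covers the whole screen.
w = gtk.gdk.get_default_root_window()
sz = w.get_size()
print("The size of the window is %d x %d" % sz)

# Allocate an 8-bit RGB pixbuf (no alpha) and copy the screen contents into it.
pb = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB, False, 8, sz[0], sz[1])
pb = pb.get_from_drawable(w, w.get_colormap(), 0, 0, 0, 0, sz[0], sz[1])
if pb is not None:
    pb.save("screenshot.png", "png")
    print("Screenshot saved to screenshot.png.")
else:
    print("Unable to get the screenshot.")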

This was taken from "Take a screenshot via a python script. [Linux]".
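The remaining steps from the question (cropping to the editor window, converting to grayscale, running Tesseract) could look roughly like the sketch below. It assumes Python 3 with Pillow and the pytesseract wrapper installed, plus the Tesseract binary itself. Note that pytesseract is a different package from the python-tesseract wrapper mentioned earlier. The function names and the window coordinates are hypothetical; finding the real window position is not covered here.

```python
def window_box(x, y, width, height):
    """Turn a window's top-left corner and size into a
    (left, top, right, bottom) box, the form Pillow's crop() expects."""
    return (x, y, x + width, y + height)


def ocr_region(screenshot_path, box):
    """Crop the screenshot to box, convert it to grayscale and OCR it."""
    # Third-party imports are kept local so window_box() works without them.
    from PIL import Image
    import pytesseract

    with Image.open(screenshot_path) as img:
        gray = img.crop(box).convert("L")  # "L" is 8-bit grayscale
    return pytesseract.image_to_string(gray)


# Example call (coordinates are placeholders, not real window geometry):
# text = ocr_region("screenshot.png", window_box(100, 50, 800, 600))
```

If the editor window fills the screen, the crop can be skipped by passing a box that covers the whole image.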
