I have a set of image files that I can identify. Rather than running OCR, I’d like to search only for matches against that set. What’s the ideal platform for finding matches quickly?
OpenCV is an advanced computer vision library. It can recognize text blocks, colors, shapes, and so on, so it might be of use.
Tesseract can be trained to handle new languages, and I can’t see a reason why you couldn’t train it on shapes instead. Here’s a really confusing training guide.
ImageMagick can also be useful. Using it means chaining a seemingly endless number of parameters, but you can get it to find images. It’s not ideal for this application, but it’s been done before. The documentation is huge, but it’s about as complete and well illustrated as I could wish for (I use it often, as it’s handy for quick image operations via the CLI). Here’s the image comparison documentation.
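For a sense of the ImageMagick route, here is a small sketch that wraps its `compare` tool from Python. It assumes ImageMagick is installed and that `compare` is on your PATH (on ImageMagick 7 you may need `magick compare` instead). The function names and file names are placeholders of mine. The `-subimage-search` option asks `compare` to look for the smaller image inside the larger one; it can be slow on big images.

```python
import shutil
import subprocess


def build_compare_command(scene_path, template_path, diff_path="null:"):
    """Build an ImageMagick `compare` command that searches for
    template_path inside scene_path (hypothetical helper).

    diff_path="null:" tells ImageMagick to discard the difference image.
    """
    return [
        "compare",
        "-metric", "RMSE",      # report root-mean-square error
        "-subimage-search",     # look for the small image inside the large one
        scene_path,
        template_path,
        diff_path,
    ]


def run_compare(scene_path, template_path):
    """Run the command and return the metric text that compare reports."""
    if shutil.which("compare") is None:
        raise RuntimeError("ImageMagick `compare` was not found on PATH")
    cmd = build_compare_command(scene_path, template_path)
    # compare writes the metric to stderr and uses a non-zero exit code
    # for dissimilar images, so the return code is not treated as an error.
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.stderr.strip()


command = build_compare_command("scene.png", "known_image.png")
print(" ".join(command))
```

Calling `run_compare("scene.png", "known_image.png")` on real files should give you the error metric and the offset of the best match, which you can then compare across your set of known images.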
I would suggest OpenCV, but it’s up to you. Good luck!