I'm using the javaocr framework from SourceForge. I'm trying to scan letters from an image and train the system to recognize them.
I get this exception when loading the training image:
    java.io.IOException: Expected to decode 26 characters but actually decoded 33 characters in training: /Developer/MAckan/bin/LETTERS/trainLetters.PNG
        at net.sourceforge.javaocr.ocrPlugins.mseOCR.TrainingImageLoader.load(TrainingImageLoader.java:111)
My code is like this:
    loader.load(this, ClassLoader.getSystemResource("LETTERS/trainLetters.PNG").getPath(), new CharacterRange('A', 'Z'), images);
Another question: how do I train it on Scandinavian letters? If I pass the range A-Ö, it expects 150 characters.
When scanning, I try to scan one line of the image at a time:
    scanner.addTrainingImages(images);
    final CharacterRange[] cr = new CharacterRange[1];
    cr[0] = new CharacterRange('A', 'Z');
    // get the first line of letters
    final int x1 = 0;
    final int y1 = 130;
    final int x2 = 640;
    final int y2 = 170;
    // each text line is 40 pixels tall, so shift the scan window down by 40 per iteration
    for (int i = 0; i < 15; i++) {
        final String text = scanner.scan(boardImage, x1, y1 + (i * 40), x2, y2 + (i * 40), cr);
        System.out.println("scanned " + text);
    }
And I actually get output, but not the output I expect…
Does anyone have experience with the javaocr framework?
Update:
Solved the training issue. The training image was missing a couple of characters, and Scandinavian letters are not supported (?). I'm still getting strange output.
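A note on the 150 figure for the A-Ö range: it equals the number of character codes from 'A' (U+0041) to 'Ö' (U+00D6) inclusive, which suggests the range is treated as a contiguous run of character codes rather than as an alphabet. That is an inference from the numbers, not something confirmed from the javaocr source. The class and method names below are made up for illustration:

```java
class RangeSize {
    // Number of chars in an inclusive range, counted by character code.
    // Hypothetical helper; not part of javaocr.
    static int size(char first, char last) {
        return last - first + 1;
    }

    public static void main(String[] args) {
        System.out.println(size('A', 'Z'));      // 26
        System.out.println(size('A', '\u00D6')); // 150, where \u00D6 is the letter O with diaeresis
    }
}
```

If that reading is right, a training image for such a range would need one glyph for every code in between, including punctuation and lowercase letters.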
Update2:
Solved the entire issue by writing my own comparison instead. I did some manipulation of the images (reduced colors and transparency), compared them pixel by pixel, and computed a diff against the alphabet images. The lowest diff “wins”. This works for this particular case, but I am still interested in getting OCR running.
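A minimal sketch of that kind of lowest-diff comparison, for anyone wanting to try the same approach. It is not the exact code used here: the names `PixelDiffMatcher`, `diff`, and `bestMatch` are illustrative, the diff is assumed to be a sum of absolute per-channel differences, and the candidate and alphabet images are assumed to be already cropped and scaled to the same size.

```java
import java.awt.image.BufferedImage;
import java.util.Map;

class PixelDiffMatcher {
    // Sum of absolute R, G and B differences over all pixels.
    // Assumption: both images have identical dimensions.
    static long diff(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            throw new IllegalArgumentException("images must have the same size");
        }
        long total = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                int p = a.getRGB(x, y);
                int q = b.getRGB(x, y);
                total += Math.abs(((p >> 16) & 0xFF) - ((q >> 16) & 0xFF)); // red
                total += Math.abs(((p >> 8) & 0xFF) - ((q >> 8) & 0xFF));   // green
                total += Math.abs((p & 0xFF) - (q & 0xFF));                 // blue
            }
        }
        return total;
    }

    // Returns the letter whose template image has the lowest diff
    // against the candidate, or '?' if no templates are given.
    static char bestMatch(BufferedImage candidate, Map<Character, BufferedImage> templates) {
        char best = '?';
        long bestDiff = Long.MAX_VALUE;
        for (Map.Entry<Character, BufferedImage> e : templates.entrySet()) {
            long d = diff(candidate, e.getValue());
            if (d < bestDiff) {
                bestDiff = d;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

This only works when the letters always appear at the same size and position within their cell, which is why it suits a fixed layout like this one but is no substitute for real OCR.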
Thanks everyone for contributing.
/A