Instead, I decided to take a blank sheet and run normalized cross-correlation with it against a filled-in box. This way, I could find the center of each filled-in box. I basically figured out how big the boxes are, and can now extract the letters mo' betta. The red circle marks where the normalized cross-correlation signal was strongest.
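For the curious, here is a rough sketch of the idea in Python with NumPy. It is not my actual code, just a minimal brute-force version: the function names (`normalized_cross_correlation`, `find_box_center`) are made up for illustration, and it assumes both the sheet and the box template are 2-D grayscale arrays.

```python
import numpy as np

def normalized_cross_correlation(image, template):
    """Brute-force NCC map; entry (r, c) scores the window whose top-left corner is (r, c)."""
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    th, tw = template.shape
    t = template - template.mean()          # zero-mean template
    t_norm = np.sqrt((t ** 2).sum())
    out_h = image.shape[0] - th + 1
    out_w = image.shape[1] - tw + 1
    scores = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            window = image[r:r + th, c:c + tw]
            w = window - window.mean()      # zero-mean window
            denom = t_norm * np.sqrt((w ** 2).sum())
            if denom > 0:                   # flat windows carry no signal; keep score 0
                scores[r, c] = (w * t).sum() / denom
    return scores

def find_box_center(image, template):
    """Return (row, col) of the center of the best-matching window."""
    scores = normalized_cross_correlation(image, template)
    r, c = np.unravel_index(np.argmax(scores), scores.shape)
    th, tw = np.asarray(template).shape
    return int(r) + th // 2, int(c) + tw // 2
```

The double loop is slow on a full scanned page. Library routines such as scikit-image's `match_template` or OpenCV's `cv2.matchTemplate` with `TM_CCOEFF_NORMED` compute the same kind of score map much faster.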
Zample:

To extract the letters, I found that the character box lines were about 3 pixels thick, so I took this into account.
The result:

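Here is roughly what "taking the line thickness into account" could look like, as a hedged sketch rather than my real code. The function name `extract_letter` and its parameters are hypothetical, and it assumes `box_size` is the odd outer width of a square box (border lines included) and that `center` is the box center found by the cross-correlation step.

```python
import numpy as np

def extract_letter(sheet, center, box_size, line_thickness=3):
    """Crop the interior of the box centered at `center`, skipping the border lines."""
    row, col = center
    half = box_size // 2
    # Distance from the center to the innermost pixel that is still inside the border.
    inner = half - line_thickness
    return sheet[row - inner:row + inner + 1, col - inner:col + inner + 1]
```

For a 31-pixel box with 3-pixel lines, this returns the 25 by 25 interior and leaves the border behind.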
Yee-haw. I took the output of this and fed it into my "get rid of white space" function, and resized the letters to 24 by 24 pixels. I also checked whether the sum of the pixels in each character image was above a certain threshold (meaning the cell is almost entirely white), and if so, I assumed there was no letter there. I display these as gray cells with an x through them. This resulted in the following:

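The cleanup steps described above might be sketched like this. Everything here is an assumption made for illustration: the helper names, the `white_level` and `blank_fraction` values, the nearest-neighbor resize, and running the blank check on the raw cell before trimming. Cells are assumed to be 8-bit grayscale arrays where white is 255.

```python
import numpy as np

def trim_whitespace(cell, white_level=200):
    """Crop to the bounding box of pixels darker than `white_level`."""
    ink = cell < white_level
    if not ink.any():
        return cell
    rows = np.flatnonzero(ink.any(axis=1))
    cols = np.flatnonzero(ink.any(axis=0))
    return cell[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

def resize_nearest(img, size=24):
    """Nearest-neighbor resize to size x size using index lookup only."""
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    return img[np.ix_(rows, cols)]

def is_blank(cell, blank_fraction=0.98):
    """True when the pixel sum is above the threshold, i.e. the cell is nearly all white."""
    threshold = blank_fraction * 255 * cell.size
    return bool(cell.sum() > threshold)

def prepare_letter(cell, size=24):
    """Return a size x size letter image, or None for an empty box."""
    if is_blank(cell):
        return None
    return resize_nearest(trim_whitespace(cell), size)
```

A very thin stroke could slip under a threshold this strict and get flagged as blank, so the fraction would need tuning against real scans.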
Here is another zample.

There is an issue though. SOME people (I'm not going to name any names) cannot write inside character boxes. Here is an example:

Because of that last 'a', my algorithm thinks there's a legitimate character there. See:

However, as far as I can tell, if someone can't keep their characters inside the boxes, they don't deserve to get their quiz graded! They can play games... but so can we! They want to play games, so be it! Let the games begin!