Monday, February 25, 2008

Experimenting with nearest neighbor

Nearest neighbor shockingly doesn't work that well. I think part of the problem is that the training data isn't all that great and there isn't enough of it.

Here's the setup:

Training:
I have 90 training images for each letter in the alphabet (there are 26 of those). That makes for 2340 total training images, for those of you who can't do maf. All of the training images are the same size: 8-bit images of 120 by 123 pixels.

Testing:
So far I only have 2 names that I'm testing against, but that set will grow shortly. I made the test characters 120 by 123 pixels as well, by adding extra white space evenly around the edges.
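Roughly, the padding step looks like this (a minimal sketch, not my code verbatim; the file name is made up, and I'm assuming 8-bit grayscale where white = 255):

    % Pad a character image evenly with white up to 120 x 123.
    target = [120 123];
    ch = imread('test_char.png');             % hypothetical test character
    padTotal = target - size(ch);             % rows/cols still needed
    ch = padarray(ch, floor(padTotal/2), 255, 'pre');    % top/left
    ch = padarray(ch, ceil(padTotal/2), 255, 'post');    % bottom/right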

Here is the testing process:
I load all of my training data into a huge matrix of size 2340 x 14760, where each row is a strung-out training image of one character (120 x 123 = 14760 pixels). I then read in a test character image, compute the Euclidean distance between that test image and each of the training images, and sort the results by distance.

Currently I am looking at the top 50 closest matches and having those vote on a character. I have been getting some good and some bad results.
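For the record, the matching step amounts to something like this (a sketch with made-up variable names; trainMat and trainLabels are assumed to hold the training matrix and the corresponding character codes):

    % Nearest neighbor with a top-50 majority vote.
    % trainMat:    2340 x 14760 double, one strung-out image per row
    % trainLabels: 2340 x 1 vector of character codes, e.g. double('a')
    test = double(imread('test_char.png'));   % hypothetical test image
    x = test(:)';                             % strung out like the training rows
    d = sqrt(sum((trainMat - repmat(x, size(trainMat, 1), 1)).^2, 2));
    [~, idx] = sort(d);                       % ascending: closest match first
    votes = trainLabels(idx(1:50));           % the 50 closest each get a vote
    guess = char(mode(votes));                % majority wins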

The first letter I tried, the 'J' from "Jean Poole", had 'J' as its top match!
At that point life was pretty good. For one thing, all of the top 10 matches were J's. So that case works pretty well.

The next letter I tried was 'e':

The top match for that was 'p'... not so good. In fact, only 3 of the 50 votes were for 'e'. Here are the training e's... so you're trying to tell me only 3 of these look like that 'e' up there?

Here is another example.. 'o':

Luckily the mode of the top 50 matches is 'o', so 'o' wins, but there are still some weird results. The training image that is closest to the 'o' is the following 'n':

I don't really get why that is. Here is the second closest match:

This is a little more reasonable, even though it is a 'c'. You can see the resemblance.

Lastly, here is the third match:

This happens to be a 'g' that was cut off at the bottom during the traumatizing training process. This is also fairly understandable. Finally, the 4th closest match is an 'o'. (Of course, after that there are plenty more random results.) This is what the first matching 'o' looks like:

Wednesday, February 13, 2008

First try at nearest neighbor - very simple!

I took one of my a's from "Jean Poole" and ran nearest neighbor with it against my training a's and b's only, and the nearest neighbor was an 'a'! Phew. At least that works. As a matter of fact, the top 10 nearest neighbors were all a's; then there was a 'b', and so on. And now for harder tests....

Test images all the same size

These are some examples of test images after I've made them all the same size. Now I have to do the same for the training images, and make sure the test images AND training images end up the same size!

Just to give you an idea, the characters are about 76 by 40 pixels.

The 'h' here is bad, but I can't fix it without changing the code a lot and breaking other things in the process:


This one looks good to me:

Example of test data

Here is what the test data looks like once I've isolated the character boxes and done a little thresholding:
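The thresholding itself is nothing fancy; roughly this (using Otsu's method via graythresh is an assumption here, as is the file name):

    % Threshold a cropped character box to a binary image.
    box = imread('char_box.png');    % hypothetical cropped character box
    level = graythresh(box);         % Otsu's method picks the threshold
    bw = im2bw(box, level);          % 1 = white paper, 0 = ink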


Here is what the test data looks like after I've removed the unwanted lines towards the edges:

Monday, February 11, 2008

Training data all set

Partial set of training data for character 'a':
All of these individual characters are saved to their own .png files. The next step is to make them all the same size by adding extra white pixels at the boundaries, based on the size of the largest training image. Then I will run the nearest neighbor algorithm on the test data. The goal is for this to be done by Wednesday.

Wednesday, February 6, 2008

Looking at thresholded characters

Just for kicks, this is what the training images look like as binary images. The top row is the 8-bit image and the bottom row is the binary image.



And b's, as always:


With the binary version, all I have to store are the non-zero indices, as opposed to storing the whole image, which is beneficial. And now I should center these characters! (Subtract off the centroid... why is that so hard?!)
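The centering I have in mind is something like this (just a sketch, assuming ink pixels are 1 in the binary image; note that circshift wraps around, which is only safe while there's white border to spare):

    % Center a binary character by moving its centroid to the middle.
    [r, c] = find(bw);                     % row/col indices of the ink
    centroid = [mean(r), mean(c)];
    middle = (size(bw) + 1) / 2;           % center of the image
    bw = circshift(bw, round(middle - centroid));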

Monday, February 4, 2008

Using results of edge.m

Since there are artifacts near the edges of the letters in both the training and test data, perhaps I will run nearest neighbor on the results of edge.m instead. Here are examples of outputs of edge.m (the figures with black in the background).
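For concreteness, this is all that step amounts to (the choice of edge detector is a guess on my part; edge.m defaults to Sobel, and I show Canny here):

    % Run edge detection on a training character.
    ch = imread('train_a_01.png');   % hypothetical grayscale training image
    E = edge(ch, 'canny');           % binary edge map: edge pixels are 1
    imshow(E);                       % white edges on a black background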

Training data 'a':


Training data 'b':


Test data with edge.m run on the individual characters (after removing the leftover lines). Still needs some work.


And these were the original characters.


Another test data image:

I'm better at getting rid of leftover lines!

Yes, I am. Now the next step is to save all training characters in one place. Here are some examples of the training data (because you really need more).

The top row is after the lines were removed; the bottom row is before.

Example 1:


Example 2:


Example 3:


Example 4:

Getting rid of leftover lines

To get rid of the leftover lines, I expand a rectangle outward from the middle of the character image, and stop each edge of the rectangle where the sum of the pixel values along it is at a maximum (i.e., where it has the most white space).
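One way to write that down (my sketch of the idea, assuming white = high pixel values; the real code has more special cases, as the examples below show):

    % Crop away leftover box lines: each rectangle edge stops at the
    % whitest row/column on its side of the middle.
    [h, w] = size(im);               % im: the character box, white = high
    rows = sum(im, 2);  cols = sum(im, 1);
    midR = round(h/2);  midC = round(w/2);
    [~, top] = max(rows(1:midR));    % whitest row above the middle
    [~, bot] = max(rows(midR:end));  bot = bot + midR - 1;
    [~, lef] = max(cols(1:midC));    % whitest column left of the middle
    [~, rig] = max(cols(midC:end));  rig = rig + midC - 1;
    cropped = im(top:bot, lef:rig);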

One issue is that the sizes of the character boxes will vary now, but hopefully I will be able to correct this by zero-padding the images (or 1-padding...).

Sometimes it works, sometimes it doesn't. Here are some examples of before and after removing the leftover lines. In each figure, the top half is after and the bottom half is before.

Good example of 'a':


Good 'b' example:


Bad 'a' example: