Thursday, January 31, 2008
Wednesday, January 30, 2008
I got some training data
Fabian gave me 18 filled-out handwriting sheets, which is very helpful! I have been parsing them and getting better results. Here is an example of another filled-out sheet, one that was scanned in at an angle. Luckily, that isn't a problem!
Here are examples of the a's from this image:
Here are examples of the b's from this image:
Monday, January 28, 2008
Example of cut out training data
Example of training data
Here is an example of the training data I'll be using for character classification. It is called ABCDETC and is from NEC Labs, available here
I am using the same parsing method that I use for the scanned names. Here non-maximal suppression actually works!
Friday, January 25, 2008
Non maximal suppression not really working
My plan was to find the top vertical line from the hough transform (the one with the most votes), take the angle of that line, and assume that the remaining vertical lines I'm interested in share that angle. At that point I take the column vector of the hough space corresponding to that angle, sort it in descending order, and plot the top 20 lines found. I also implemented non-maximal suppression in the following way: for each line that I'm about to plot, I check whether the number of votes for that line is the maximum in that cell's surrounding window (just looking at changes in rho, not theta). If it isn't a maximum then I skip it - I don't plot that line.
I've been having a hard time tuning the window size - how big should the window be? When it is too big I start missing lines that I need, but when it is too small it doesn't get rid of enough lines. The following is a perfect example. I was playing with the window size for the following image and this was the best I could get. It didn't get rid of the additional vertical line between the e and the a, while it missed the line between the n and the space.
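Here is a minimal sketch of the suppression described above, written in Python rather than the Matlab used for the project. The function name `top_lines_with_nms`, the `half_window` parameter, and the toy vote values are assumptions made for illustration; the input stands for the single hough-space column at the chosen angle.

```python
def top_lines_with_nms(votes, top_k=20, half_window=3):
    """Pick strong lines from one Hough column (fixed theta, varying rho).

    votes[r] is the vote count of rho bin r. The top_k bins by vote count are
    visited in descending order, and a bin is kept only if no bin within
    +/- half_window of it has more votes. Ties are kept (an assumption).
    """
    order = sorted(range(len(votes)), key=lambda r: votes[r], reverse=True)
    kept = []
    for r in order[:top_k]:
        lo = max(0, r - half_window)
        hi = min(len(votes), r + half_window + 1)
        if votes[r] == max(votes[lo:hi]):
            kept.append(r)
    return kept


# Toy column: a strong line at rho bin 2 with a near-duplicate at bin 3,
# and a weaker but genuine line at bin 6.
votes = [0, 5, 9, 8, 0, 0, 7, 0, 0, 3]
print(top_lines_with_nms(votes, top_k=4, half_window=1))  # [2, 6]
print(top_lines_with_nms(votes, top_k=4, half_window=4))  # [2]
```

The second call shows the trade-off on a small scale: a window wide enough to reach from bin 6 to bin 2 suppresses the genuine line at bin 6 along with the duplicate at bin 3.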
Isolated letters
These are what the letters look like when I cut them out of the image based on the lines found by the hough transform:
My next step is to write a function that takes in one of these letter images and determines whether or not there is a letter inside. I also need to think of a way to implement non-maximal suppression!
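One possible shape for that function, sketched in Python rather than Matlab. Everything here is an assumption and not something the project has settled on: the image is taken to be binary (1 = ink, 0 = paper), the detected vertical lines are given as x positions, and a box counts as containing a letter when at least `min_ink_fraction` of its pixels are ink. The names `cut_boxes` and `has_letter` are hypothetical.

```python
def cut_boxes(image, line_positions):
    """Cut the image into boxes between consecutive vertical lines.

    image is a list of rows, each a list of 0/1 pixels. The columns at the
    line positions themselves are left out so the box borders are not
    counted as ink.
    """
    xs = sorted(line_positions)
    return [[row[left + 1:right] for row in image]
            for left, right in zip(xs, xs[1:])]


def has_letter(box, min_ink_fraction=0.05):
    """Guess whether a box holds a letter from the fraction of ink pixels."""
    total = sum(len(row) for row in box)
    if total == 0:
        return False
    ink = sum(sum(row) for row in box)
    return ink / total >= min_ink_fraction


# Toy 3x9 image: vertical lines at x = 0, 4, 8, a plus sign in the first
# box, and nothing in the second box.
image = [
    [1, 0, 1, 0, 1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1, 0, 0, 0, 1],
]
boxes = cut_boxes(image, [0, 4, 8])
print([has_letter(b) for b in boxes])  # [True, False]
```

A fixed ink fraction is the simplest test I can think of. It would likely need tuning on real scans, where leftover box borders and scanner noise also show up as ink.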
Wednesday, January 23, 2008
Better Isolated Boxes
Tuesday, January 22, 2008
Isolating character boxes
This is where I'm currently at with regards to isolating the character boxes:
This is without rotating the image at all - just taking both the vertical and horizontal gradient, running edge.m on each of those, and then running the hough transform on the result.
Here is a cool image of the hough space for the vertical lines:
I have been trying so hard to rotate the images properly that I forgot my real goal, which was to isolate the character boxes. When I was finally able to rotate the image, it turned out not to help at all! Here are examples of before and after rotated images:
As you can see, Matlab didn't do a great job cropping the image after rotating it, even though I specified 'crop' to imrotate.m. Oh well, for the time being I'm not using it anyway.
Wednesday, January 16, 2008
Better angles from Hough
I decreased the size of the cells in the accumulator array that the Hough Transform uses and the accuracy of the angles of the detected lines significantly improved. Yesterday I was using a resolution of 1 radian, but now my resolution is pi/450.
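To get a feel for why the cell size matters, here is a small back-of-the-envelope calculation in Python (not the Matlab used for the project). It assumes a detected angle is snapped to the nearest accumulator cell, so the worst-case angular error is half the cell width. The function name `worst_case_drift` and the 1000-pixel line length are hypothetical.

```python
import math


def worst_case_drift(theta_step, line_length):
    """Perpendicular drift, in pixels, at the far end of a line.

    Assumes the true angle is off from the detected one by at most half an
    accumulator cell (theta_step / 2), measured over line_length pixels.
    """
    max_error = theta_step / 2
    return line_length * math.tan(max_error)


for label, step in [("pi/450", math.pi / 450), ("pi/720", math.pi / 720)]:
    print(label, round(worst_case_drift(step, 1000), 2))
```

Under these assumptions a resolution of pi/450 leaves a drawn line at most about 3.5 pixels off at the end of a 1000-pixel line, and pi/720 about 2.2 pixels.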
There are still too many lines detected for the upper boxes compared to the lower boxes. Here is what I get when I plot the top 5 lines detected (instead of thresholding):
3 of the 5 lines detected are for the top horizontal line. The 12th line that is detected is the upper horizontal line for the PID. I'm not quite sure why that is. Here is what I get when taking the top 12 lines:
When I get greedy and make the angle resolution too fine, it starts to hurt: lines go undetected. For example, when I set the resolution to pi/720, my top 20 detected lines are the following:
It doesn't even detect the bottom horizontal line.
This is the process I've been using to detect the lines:
1. Resize the image to half of its size.
2. Take the gradient of the image.
3. Run edge.m on the vertical gradient image, specifying 'sobel' as a parameter.
4. Run the Hough Transform on the result of edge.m
5. Sort the accumulator matrix in descending order and plot the top x lines on top of the original image.
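The five steps above can be sketched end to end as follows. This is Python with no libraries, not the Matlab code used for the project, and several pieces are simplified stand-ins labeled as assumptions: `halve` just drops every other pixel, `edge_points` is a plain gradient threshold and not edge.m with 'sobel', and the accumulator uses 1-pixel rho bins. All function names are hypothetical.

```python
import math


def halve(image):
    # Step 1 (simplified): keep every other row and column.
    return [row[::2] for row in image[::2]]


def vertical_gradient(image):
    # Step 2: central difference down the rows. It is large where intensity
    # changes from one row to the next, i.e. around horizontal lines.
    h, w = len(image), len(image[0])
    return [[image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
             for x in range(w)] for y in range(h)]


def edge_points(grad, threshold):
    # Step 3 (stand-in for edge.m): pixels whose |gradient| >= threshold.
    return [(x, y) for y, row in enumerate(grad)
            for x, g in enumerate(row) if abs(g) >= threshold]


def hough(points, n_theta, max_rho):
    # Step 4: vote for rho = x*cos(theta) + y*sin(theta), theta in [0, pi).
    acc = [[0] * (2 * max_rho + 1) for _ in range(n_theta)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            acc[t][rho + max_rho] += 1
    return acc


def top_lines(acc, max_rho, count):
    # Step 5: sort all cells by votes, descending; return (votes, t, rho).
    cells = [(votes, t, r - max_rho)
             for t, row in enumerate(acc) for r, votes in enumerate(row)]
    cells.sort(key=lambda c: c[0], reverse=True)
    return cells[:count]


# Synthetic 40x40 image with one horizontal line at row 20.
image = [[1 if y == 20 else 0 for _ in range(40)] for y in range(40)]
small = halve(image)
points = edge_points(vertical_gradient(small), threshold=1)
acc = hough(points, n_theta=180, max_rho=30)
for votes, t, rho in top_lines(acc, max_rho=30, count=5):
    print(votes, "votes at theta index", t, "rho", rho)
```

On this synthetic image several neighbouring cells tie for the top spot, because both edges of the single line and a few adjacent angles all collect the full number of votes. That is a small-scale version of the duplicate detections seen on the real scans.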
Here is an example vertical gradient of the image:
Here is an example of the output of edge.m, passed into the Hough Transform function:
Finally, here is an example of output from the Hough Transform function:
The white portions in the middle of the image correspond to the lines detected in the image.
The next significant problem is detecting the vertical lines. When I instead use the horizontal gradient, here are the top 20 lines I detect (none of which are useful):
Also, here is the output of the Hough Transform. Note how many white parts there are - meaning lots of lines were detected.
Ok one last thing - here is what is given to the Hough Transform function in the vertical case:
Tuesday, January 15, 2008
Hough is working better but not there yet
The change I made was first taking the gradient of the image before running edge.m on it (and finally the hough transform). This seemed to get rid of a lot of the extra lines towards the top of the image without sacrificing the lines on the bottom of the image.
I still have the problem that the detected lines don't seem to be oriented correctly - perhaps the angle is off.
Here is what I get now just looking at the vertical gradient and thresholding at 50% of the accumulator:
The next step is to get the lines detected on target (have the right angle) and also detect the vertical lines (which has proven to be significantly more difficult).
Monday, January 14, 2008
Why is the Hough Transform not working?
Ugh... these lines are incorrect!
Thresholding at 50% max value of accumulator:
Finds too few lines, and the lines that it finds aren't completely correct imho:
Thresholding at ~44% max value of accumulator:
Finds too many lines for the name portion and not enough lines for the PID portion!
[Ignore the titles of these images!]
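For reference, thresholding at a fraction of the accumulator's maximum can be sketched like this. It is in Python, not the Matlab used for the project. The function name `lines_above_threshold`, the choice of `>=` for the comparison, and the vote counts in the toy accumulator are all assumptions made for illustration.

```python
def lines_above_threshold(acc, fraction=0.5):
    """Return (theta_index, rho_index) of cells with votes >= fraction * max."""
    peak = max(max(row) for row in acc)
    cutoff = fraction * peak
    return [(t, r) for t, row in enumerate(acc)
            for r, votes in enumerate(row) if votes > 0 and votes >= cutoff]


# Toy accumulator with a single theta row: a strong line (100 votes), a
# near-duplicate of it (90), and two weaker but genuine lines (48 and 45).
acc = [[100, 90, 0, 48, 45]]
print(lines_above_threshold(acc, 0.5))   # [(0, 0), (0, 1)]
print(lines_above_threshold(acc, 0.44))  # [(0, 0), (0, 1), (0, 3), (0, 4)]
```

With made-up numbers like these, no single global cutoff works: a cutoff high enough to reject the near-duplicate at 90 votes also rejects the genuine lines at 48 and 45.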
Friday, January 11, 2008
Sample Input
Here is an example of an input quiz or assignment. This is the subset of the quiz or assignment that contains the student's name and PID number. Part of the preprocessing will be isolating the name and PID number from the rest of the quiz.
One idea that is up in the air right now is using character boxes for the students to enter their name and PID. This simplifies the problem significantly, as I do not need to figure out myself where one letter ends and the next begins.
Sunday, January 6, 2008
Welcome to Dafna's CSE 190 blog!
I will be doing a project on the recognition of handwritten names from a limited and closed lexicon using a hidden Markov model.