Wednesday, March 4, 2009

Well, I ran it already

Finally, I got everything together and ran my algorithm. It did horribly. We're talking 0% accuracy horrible. I am using the same hidden Markov model (HMM) as last year, and I think the model is just too unforgiving: once it mispredicts the first letter, we're basically screwed.

I tried abandoning the HMM and instead predicting each letter in the test name independently of the other letters in the name. For each letter, I took the prediction to be whichever of the 26 possibilities had the highest confidence. Then, at the end, I found the nearest neighbor of the predicted name among all of the names in the roster. This also did horribly.
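Here is a rough sketch of that per-letter approach, just to make the steps concrete. It is not my actual code: the function names are made up for illustration, the confidences are assumed to arrive as one list of 26 numbers per letter position, and I am assuming edit (Levenshtein) distance for the nearest-neighbor step, which is only one possible choice.

```python
import string

ALPHABET = string.ascii_lowercase  # the 26 possible letters


def predict_letters(confidences):
    """Pick the highest-confidence letter at each position, independently.

    `confidences` is assumed to be a list of 26-element lists,
    one list per letter position in the test name.
    """
    letters = []
    for row in confidences:
        best = max(range(len(ALPHABET)), key=lambda i: row[i])
        letters.append(ALPHABET[best])
    return "".join(letters)


def edit_distance(a, b):
    """Levenshtein distance (an assumed choice of name-to-name distance)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def nearest_roster_name(predicted, roster):
    """Return the roster name closest to the predicted string."""
    return min(roster, key=lambda name: edit_distance(predicted, name))
```

Something like `nearest_roster_name(predict_letters(confidences), roster)` would then give the final guess. Note that only the single top letter at each position ever reaches the roster lookup.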

I took a look at what was going on under the covers. I found that for each letter, the confidence of the correct letter is pretty high compared to the rest of the confidences, but some other letter always beats it by a little bit. And my algorithm doesn't care about 2nd place. Therefore, I need another way!
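To make the "beaten by a little bit" observation concrete, here is a small diagnostic sketch. The helper name is hypothetical, and it assumes the same layout as before: one list of 26 confidences per letter, plus the index of the correct letter.

```python
def rank_and_margin(row, true_index):
    """Return (rank, margin) for the correct letter at one position.

    rank   -- 1 if the correct letter has the highest confidence, 2 if it
              is second, and so on.
    margin -- how far the winning confidence is above the correct
              letter's confidence (0 when the correct letter wins).
    """
    # Letter indices sorted from most to least confident.
    order = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
    rank = order.index(true_index) + 1
    margin = row[order[0]] - row[true_index]
    return rank, margin
```

A rank of 2 with a tiny margin at most positions is exactly the situation described above: the right answer is nearly there, but taking only the max throws it away.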
