Monday, March 10, 2008

Emission probabilities

An emission probability in my case is P(image|letter): the probability of the image given that you know it is of a particular character. In other words, the probability of a certain letter looking like this. What I've actually been calculating is P(letter|image), since I've been doing my own version of the nearest neighbor density. P(image|letter) != P(letter|image), so this was wrong! I can use Bayes' rule to turn the probability around though:

P(image|letter) = P(letter|image)P(image) / P(letter)

I don't know what to plug in for P(image), so for now I've just been using 1, i.e. treating every image as equally probable. P(letter) is easy to find: just count how many times each letter appears in the roster and divide by the total number of characters. With P(image) = 1, my new equation is:

P(image|letter) = P(letter|image) / P(letter)
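
As a rough sketch of this calculation (not my actual code), the Python below estimates P(letter) from a roster and divides a set of P(letter|image) values by it. The function names, the two-name roster, and the posterior values are all made up for illustration, and I'm assuming the roster is lowercased and only alphabetic characters are counted.

```python
from collections import Counter

def letter_priors(roster):
    """Estimate P(letter) as each letter's share of all letters in the roster.

    Assumption: names are lowercased and non-alphabetic characters are skipped.
    """
    counts = Counter(ch for name in roster for ch in name.lower() if ch.isalpha())
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}

def emission_scores(posteriors, priors):
    """Apply P(image|letter) = P(letter|image) / P(letter), with P(image) set to 1.

    Letters that never occur in the roster have no prior and are skipped.
    """
    return {ch: p / priors[ch] for ch, p in posteriors.items() if ch in priors}

# Hypothetical roster and hypothetical nearest-neighbor output for one image.
roster = ["anna", "bob"]
priors = letter_priors(roster)      # a, n, b: 2/7 each; o: 1/7
posteriors = {"a": 0.5, "o": 0.5}   # stand-in values for P(letter|image)
scores = emission_scores(posteriors, priors)
print(scores)
```

Because P(image) is fixed at 1, the resulting values can be greater than 1, so they are best read as scores proportional to P(image|letter) rather than as true probabilities.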

I noticed that if a letter has a high probability P(letter), its P(image|letter) comes out lower than it would for a less likely character. Since the denominator is always <= 1, dividing by it can only push the value up from P(letter|image), and if P(letter) is very low then P(image|letter) will be very high.
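
To make that effect concrete, here is a small Python illustration with invented numbers (they are not from my data): a common letter and a rare letter compete for the same image, and dividing by P(letter) changes which one scores highest.

```python
# Invented values for illustration only: "e" is common, "z" is rare.
posteriors = {"e": 0.6, "z": 0.4}   # stand-ins for P(letter|image)
priors = {"e": 0.12, "z": 0.01}     # stand-ins for P(letter)

# P(image|letter) = P(letter|image) / P(letter), with P(image) taken as 1.
scores = {ch: posteriors[ch] / priors[ch] for ch in posteriors}

best_before = max(posteriors, key=posteriors.get)  # highest P(letter|image)
best_after = max(scores, key=scores.get)           # highest P(image|letter) score

print(best_before, best_after)  # prints: e z
```

With these numbers the rare letter overtakes the common one even though the classifier originally preferred the common one.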

My results became worse once I took this new probability into account.

1 comment:

Voyager31 said...

Hi there, I am doing my project on character recognition using HMMs. I am using Matlab for it, but I have no idea what the input should be or how the character gets recognised, since very little study material is available on HMMs.
Kindly suggest any books on HMMs for character recognition. If possible, please upload your code.
Thanks.
Prasenjeet