I am having a tough time figuring out how to use Kevin Murphy's HMM Toolbox. It would be a great help if anyone who has experience with it could clarify some conceptual questions. I more or less understand the theory behind HMMs, but I am confused about how to actually implement one and how to set all the parameters.
There are 2 classes, so we need 2 HMMs.
Let's say the training vectors are: class1 O1={4 3 5 1 2} and class2 O2={1 4 3 2 4}.
Now, the system has to classify an unknown sequence O3={1 3 2 4 4} as either class1 or class2.
- What is going to go in obsmat0 and obsmat1?
- What is the syntax for specifying the transition probabilities transmat0 and transmat1?
- What is the variable data going to be in this case?
- Would the number of states be Q=5, since there are five unique numbers/symbols used?
- Is the number of output symbols 5?
- How do I mention the transition probabilities transmat0 and transmat1?
Instead of answering each individual question, let me illustrate how to use the HMM toolbox with an example: the weather example that is usually used when introducing hidden Markov models.
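Basically the states of the model are the three possible types of weather: sunny, rainy and foggy. On any given day, we assume the weather can be only one of these values. Thus the set of HMM states is {sunny, rainy, foggy}.
However, in this example we can't observe the weather directly (apparently we are locked in the basement!). Instead, the only evidence we have is whether the person who checks on us every day is carrying an umbrella or not. In HMM terminology, these are the discrete observations: {umbrella, no umbrella}.
The HMM model is characterized by three things: the prior probabilities (the distribution over the state on the first day), the transition probabilities (of moving from one weather state to the next), and the emission probabilities (of each observation given the current state).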
Next, we are either given these probabilities, or we have to learn them from a training set. Once that's done, we can do reasoning such as computing the likelihood of an observation sequence with respect to an HMM model (or a bunch of models, picking the most likely one)…
1) known model parameters
Here is some sample code that shows how to fill in known probabilities to build the model:
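As a toolbox-independent sketch, the three ingredients can be written down in plain Python (not the toolbox's MATLAB API). All probability values below are made-up illustrations, and the helper `is_stochastic` is a hypothetical name. In the naming used in the question, `A` would play the role of a transmat and `B` the role of an obsmat.

```python
# Illustrative weather HMM; every probability below is an assumed value.
states = ["sunny", "rainy", "foggy"]      # hidden states
symbols = ["umbrella", "no umbrella"]     # discrete observations

# prior[i]: probability that the first day is in state i
prior = [1.0, 0.0, 0.0]

# A[i][j]: probability of moving from state i to state j
A = [[0.80, 0.05, 0.15],
     [0.20, 0.60, 0.20],
     [0.20, 0.30, 0.50]]

# B[i][k]: probability of observing symbol k while in state i
B = [[0.1, 0.9],
     [0.8, 0.2],
     [0.3, 0.7]]


def is_stochastic(rows, tol=1e-9):
    """Return True if every row is a valid probability distribution."""
    return all(abs(sum(r) - 1.0) < tol and min(r) >= 0.0 for r in rows)


# Sanity check: the prior and each row of A and B must sum to 1.
assert is_stochastic([prior]) and is_stochastic(A) and is_stochastic(B)
```

Note the shapes: `A` is states by states, while `B` is states by symbols, so the number of hidden states and the number of output symbols are chosen independently.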
Then we can sample a bunch of sequences from this model:
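Sampling from an HMM amounts to drawing a first state from the prior, then alternately emitting a symbol and moving to the next state. The following is a minimal plain-Python sketch of that idea, not the toolbox's own sampler; the probabilities, the seed, and the name `sample_sequence` are assumptions for illustration.

```python
import random

# Assumed model: states 0=sunny, 1=rainy, 2=foggy; symbols 0=umbrella, 1=no umbrella.
prior = [1.0, 0.0, 0.0]
A = [[0.80, 0.05, 0.15], [0.20, 0.60, 0.20], [0.20, 0.30, 0.50]]
B = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]


def sample_sequence(prior, A, B, length, rng):
    """Draw one (observations, hidden states) pair of the given length."""
    state = rng.choices(range(len(prior)), weights=prior)[0]
    obs, hidden = [], []
    for _ in range(length):
        hidden.append(state)
        # Emit a symbol from the current state's emission row.
        obs.append(rng.choices(range(len(B[state])), weights=B[state])[0])
        # Move to the next state using the current state's transition row.
        state = rng.choices(range(len(A[state])), weights=A[state])[0]
    return obs, hidden


rng = random.Random(0)   # fixed seed so that runs are repeatable
T, nex = 10, 20          # sequence length and number of sequences (arbitrary)
samples = [sample_sequence(prior, A, B, T, rng) for _ in range(nex)]
seqs = [obs for obs, _ in samples]
```

Each entry of `seqs` is one observation sequence; the hidden states are kept alongside only so they can be inspected later.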
for example, the 5th example was:
we can evaluate the log-likelihood of the sequence:
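For reference, the log-likelihood of a sequence is what the forward algorithm computes. A minimal plain-Python sketch, with per-step rescaling to avoid underflow, might look like this; it is not the toolbox's implementation, and the probabilities and the name `log_likelihood` are assumptions.

```python
import math

# Assumed model: states 0=sunny, 1=rainy, 2=foggy; symbols 0=umbrella, 1=no umbrella.
prior = [1.0, 0.0, 0.0]
A = [[0.80, 0.05, 0.15], [0.20, 0.60, 0.20], [0.20, 0.30, 0.50]]
B = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]


def log_likelihood(obs, prior, A, B):
    """Return log P(obs | model) using the scaled forward algorithm."""
    n = len(prior)
    alpha = [prior[i] * B[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through A, then weight by the emission of obs[t].
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")   # the model cannot produce this sequence
        alpha = [a / scale for a in alpha]
        loglik += math.log(scale)  # the log of the scales sums to the log-likelihood
    return loglik


ll = log_likelihood([1, 1, 0, 0, 1], prior, A, B)
```

The log-likelihood of a whole set of sequences is the sum of the per-sequence values, assuming the sequences are independent.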
or compute the Viterbi path (most probable state sequence):
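The Viterbi path itself comes from a dynamic program over log-probabilities with back-pointers. Below is a minimal plain-Python sketch of that recursion, independent of the toolbox; the probabilities and the name `viterbi` are assumptions for illustration.

```python
import math

# Assumed model: states 0=sunny, 1=rainy, 2=foggy; symbols 0=umbrella, 1=no umbrella.
prior = [1.0, 0.0, 0.0]
A = [[0.80, 0.05, 0.15], [0.20, 0.60, 0.20], [0.20, 0.30, 0.50]]
B = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]


def safe_log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0.0 else float("-inf")


def viterbi(obs, prior, A, B):
    """Return the most probable hidden state sequence for obs."""
    n = len(prior)
    delta = [safe_log(prior[i]) + safe_log(B[i][obs[0]]) for i in range(n)]
    back = []  # back[t][j]: best predecessor of state j at step t + 1
    for o in obs[1:]:
        prev, ptr, delta = delta, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + safe_log(A[i][j]))
            ptr.append(best)
            delta.append(prev[best] + safe_log(A[best][j]) + safe_log(B[j][o]))
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


path = viterbi([1, 1, 0, 0, 1], prior, A, B)
```

Working in log space keeps long sequences from underflowing, and zero-probability moves simply become `-inf` scores that are never selected when an alternative exists.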
2) unknown model parameters
Training is performed using the EM algorithm, and is best done with a set of observation sequences.
Continuing on the same example, we can use the generated data above to train a new model and compare it to the original:
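To show what the EM (Baum-Welch) training loop does internally, here is a compact plain-Python sketch. It is not the toolbox's code: the "true" probabilities, the random initialisation, the seed, the iteration count, and all function names are assumptions made for illustration.

```python
import math
import random


def forward_backward(obs, prior, A, B):
    """Scaled forward-backward pass; returns (alpha, beta, scales)."""
    n, T = len(prior), len(obs)
    alpha, scales = [], []
    for t in range(T):
        if t == 0:
            a = [prior[i] * B[i][obs[0]] for i in range(n)]
        else:
            a = [sum(alpha[-1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                 for j in range(n)]
        c = sum(a)
        scales.append(c)
        alpha.append([x / c for x in a])
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                       for j in range(n)) / scales[t + 1] for i in range(n)]
    return alpha, beta, scales


def normalise(row):
    total = sum(row)
    return [x / total for x in row]


def em_step(seqs, prior, A, B):
    """One Baum-Welch update; also returns the log-likelihood of the input model."""
    n, m = len(prior), len(B[0])
    pi_num = [0.0] * n
    A_num = [[0.0] * n for _ in range(n)]
    B_num = [[0.0] * m for _ in range(n)]
    loglik = 0.0
    for obs in seqs:
        alpha, beta, scales = forward_backward(obs, prior, A, B)
        loglik += sum(math.log(c) for c in scales)
        for t in range(len(obs)):
            for i in range(n):
                gamma = alpha[t][i] * beta[t][i]      # P(state i at t | obs)
                if t == 0:
                    pi_num[i] += gamma
                B_num[i][obs[t]] += gamma
                if t + 1 < len(obs):
                    for j in range(n):                # P(i at t, j at t+1 | obs)
                        A_num[i][j] += (alpha[t][i] * A[i][j] * B[j][obs[t + 1]]
                                        * beta[t + 1][j] / scales[t + 1])
    return (normalise(pi_num), [normalise(r) for r in A_num],
            [normalise(r) for r in B_num], loglik)


def sample(prior, A, B, length, rng):
    """Draw one observation sequence from the model."""
    state = rng.choices(range(len(prior)), weights=prior)[0]
    obs = []
    for _ in range(length):
        obs.append(rng.choices(range(len(B[state])), weights=B[state])[0])
        state = rng.choices(range(len(A[state])), weights=A[state])[0]
    return obs


def random_rows(n_rows, n_cols, rng):
    """Random strictly positive rows, each normalised to sum to 1."""
    return [normalise([rng.random() + 0.1 for _ in range(n_cols)])
            for _ in range(n_rows)]


# "True" model used only to generate training data (assumed values).
prior = [1.0, 0.0, 0.0]
A = [[0.80, 0.05, 0.15], [0.20, 0.60, 0.20], [0.20, 0.30, 0.50]]
B = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
rng = random.Random(0)
seqs = [sample(prior, A, B, 10, rng) for _ in range(20)]

# Random starting guess, then 50 EM iterations (an arbitrary cap).
prior_hat = random_rows(1, 3, rng)[0]
A_hat, B_hat = random_rows(3, 3, rng), random_rows(3, 2, rng)
history = []
for _ in range(50):
    prior_hat, A_hat, B_hat, ll = em_step(seqs, prior_hat, A_hat, B_hat)
    history.append(ll)
```

The values in `history` should never decrease from one iteration to the next, which is a handy sanity check. EM only finds a local optimum, so with this little data the estimates can differ noticeably from the generating model, and different starting guesses can give different results.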
Keep in mind that the order of the states doesn't have to match. That's why we need to permute the states before comparing the two models. In this example, the trained model looks close to the original one.
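One simple way to do the permutation step, sketched in plain Python, is to try every relabelling of the states and keep the one that brings the trained matrices closest to the reference. The function names and the "trained" numbers below are hypothetical, and brute force over all permutations is only sensible for a handful of states.

```python
from itertools import permutations


def permute_model(perm, prior, A, B):
    """Relabel states so that new state k is old state perm[k]."""
    return ([prior[i] for i in perm],
            [[A[i][j] for j in perm] for i in perm],
            [list(B[i]) for i in perm])


def distance(model, ref):
    """Sum of absolute differences over the transition and emission matrices."""
    (_, A1, B1), (_, A2, B2) = model, ref
    return sum(abs(x - y)
               for r1, r2 in zip(A1 + B1, A2 + B2)
               for x, y in zip(r1, r2))


def align_states(model, ref):
    """Return (perm, relabelled model) minimising the distance to ref."""
    n = len(ref[0])
    perm = min(permutations(range(n)),
               key=lambda p: distance(permute_model(p, *model), ref))
    return perm, permute_model(perm, *model)


# Reference model (assumed values) and a hypothetical trained model whose
# states came out in a different order.
ref = ([1.0, 0.0, 0.0],
       [[0.80, 0.05, 0.15], [0.20, 0.60, 0.20], [0.20, 0.30, 0.50]],
       [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
trained = ([0.0, 0.1, 0.9],
           [[0.55, 0.25, 0.20], [0.25, 0.50, 0.25], [0.10, 0.10, 0.80]],
           [[0.75, 0.25], [0.35, 0.65], [0.15, 0.85]])

perm, aligned = align_states(trained, ref)
```

After alignment, `aligned` can be compared with `ref` entry by entry.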
There are more things you can do with hidden Markov models, such as classification or pattern recognition. You would have different sets of observation sequences belonging to different classes. You start by training a model for each set. Then, given a new observation sequence, you classify it by computing its likelihood with respect to each model and predicting the class whose model gives the highest log-likelihood.
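The classification step can be sketched in a few lines of plain Python. The two models below are hand-written with assumed probabilities purely for illustration (in practice each would be trained on its own class's sequences), and `log_likelihood` and `classify` are hypothetical names rather than toolbox functions.

```python
import math


def log_likelihood(obs, prior, A, B):
    """Return log P(obs | model) using the scaled forward algorithm."""
    n = len(prior)
    alpha = [prior[i] * B[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        alpha = [a / scale for a in alpha]
        loglik += math.log(scale)
    return loglik


def classify(obs, models):
    """Return the label of the model with the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))


# One (prior, A, B) triple per class; all numbers are assumed values.
models = {
    "class1": ([1.0, 0.0, 0.0],
               [[0.80, 0.05, 0.15], [0.20, 0.60, 0.20], [0.20, 0.30, 0.50]],
               [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]),
    "class2": ([0.5, 0.5],
               [[0.5, 0.5], [0.5, 0.5]],
               [[0.9, 0.1], [0.9, 0.1]]),
}

label = classify([0, 0, 0, 0, 0], models)
```

The models do not need the same number of hidden states, only the same symbol alphabet. The same recipe would apply to the two classes in the question once a model has been trained for each.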