Wednesday 3 June 2015

Hidden Markov Models

Last week, I described what a dynamic Bayesian network is and wrote up the basic framework that the dynamic Bayesian network encompasses. Currently, I am planning the hidden Markov model framework. The functionality of the hidden Markov model will be similar to that of MATLAB.
First week report
I have finished coding the dynamic Bayesian network, which has methods for unrolling the network and removing time slices.

What are hidden Markov models (HMMs)?
They are essentially dynamic Bayesian networks on a smaller scale, which makes them less complex and more efficient. In the example above, the nodes Q1 and Q2 are the hidden nodes, while the nodes Y1 and Y2 are the observed nodes.
The hidden states are represented by discrete random variables. Again, this is a 2-TBN, just like the dynamic Bayesian networks from last week.
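To make the 2-TBN structure concrete, here is a minimal sketch in plain Python (independent of any particular library) that lists the hidden and observed nodes and the intra- and inter-slice edges; the node names Q and Y and the (name, time-slice) tuples are only illustrative conventions, not the final API.

```python
# A 2-TBN for an HMM, written as plain (node, time_slice) pairs.
# Q is the hidden variable, Y is the observed variable.
hidden = [("Q", 0), ("Q", 1)]
observed = [("Y", 0), ("Y", 1)]

edges = [
    (("Q", 0), ("Y", 0)),  # intra-slice: the hidden state emits an observation
    (("Q", 1), ("Y", 1)),
    (("Q", 0), ("Q", 1)),  # inter-slice: the state at t-1 influences the state at t
]

# Unrolling to T slices just repeats the same edge pattern.
def unroll(T):
    e = [(("Q", t), ("Y", t)) for t in range(T)]
    e += [(("Q", t), ("Q", t + 1)) for t in range(T - 1)]
    return e

print(unroll(4))
```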

The conditional distribution of the nodes in a time slice, given the previous slice, factorises as
                   $P(Z_t|Z_{t-1}) = \prod_{i = 1}^{N} P(Z_t^{(i)}| Pa(Z_t^{(i)}))$
where $Z_t^{(i)}$ is the $i$-th node in time slice $t$, $N = N_h + N_o$ is the total number of hidden and observed nodes per slice, and $Pa(Z_t^{(i)})$ are the parents of the node $Z_t^{(i)}$, which may be in the same time slice or in the previous one.
The above figure shows a 2-TBN. For the two slices shown, the joint probability is:
                  $P(Q_{1:2}, Y_{1:2}) = P(Q_1)\times P(Y_1|Q_1)\times P(Q_2|Q_1)\times P(Y_2|Q_2)$
The general equation becomes:
                  $P(Q_{1:T}, Y_{1:T}) =  P(Q_1)P(Y_1|Q_1) \times  \prod_{t= 2}^{T}P(Q_t|Q_{t-1})P(Y_t|Q_t)$
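Here is a small numpy sketch of this factorisation, assuming a discrete HMM parameterised by an initial distribution pi, a transition matrix A and an emission matrix B (all made-up numbers for illustration); it evaluates the chain-rule product for a concrete state/observation sequence.

```python
import numpy as np

# Illustrative parameters for a 2-state, 3-symbol HMM (made-up numbers).
pi = np.array([0.6, 0.4])          # pi[k]   = P(Q_1 = k)
A = np.array([[0.7, 0.3],          # A[i, j] = P(Q_t = j | Q_{t-1} = i)
              [0.2, 0.8]])
B = np.array([[0.5, 0.4, 0.1],     # B[i, y] = P(Y_t = y | Q_t = i)
              [0.1, 0.3, 0.6]])

def joint_probability(states, obs):
    """P(Q_{1:T} = states, Y_{1:T} = obs) via the chain-rule factorisation."""
    p = pi[states[0]] * B[states[0], obs[0]]
    for t in range(1, len(states)):
        p *= A[states[t - 1], states[t]] * B[states[t], obs[t]]
    return p

# Factors: 0.6 * 0.5 * 0.7 * 0.4 * 0.3 * 0.6
print(joint_probability([0, 0, 1], [0, 1, 2]))
```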
There are more sophisticated standard models, such as coupled HMMs,
which look like this:
Now in the above model there are several hidden chains interacting with each other, together with the observed variables $Y_1$, $Y_2$ and $Y_3$.
The chains interact directly with their adjacent chains, so there is a possibility of $X_2'$ interacting with the first chain if the network is unrolled further.
Coupled HMMs are used, for example, in modelling temperature sensors for fire alarms, where they explicitly encode the relation between the sensors.
Another task that HMMs help with is finding the most probable sequence of hidden states over a series of time slices. This most probable state sequence is found with the popular Viterbi algorithm.
The states are found as follows:
$\begin{array}{rcl}
V_{1,k} &=& \mathrm{P}\big( y_1 \ | \ k \big) \cdot \pi_k \\
V_{t,k} &=& \max_{x \in S} \left(  \mathrm{P}\big( y_t \ | \ k \big) \cdot a_{x,k} \cdot V_{t-1,x}\right)
\end{array}$
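Below is a minimal numpy sketch of this recursion, reusing the illustrative pi, A and B matrices from the sketch above (again, the names and numbers are assumptions for illustration, not part of any existing library); it keeps back-pointers so the most probable state sequence can be recovered at the end.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most probable hidden state sequence for a discrete HMM.

    obs : list of observation indices y_1..y_T
    pi  : initial distribution, pi[k] = P(Q_1 = k)
    A   : transition matrix, A[x, k] = P(Q_t = k | Q_{t-1} = x)
    B   : emission matrix, B[k, y] = P(Y_t = y | Q_t = k)
    """
    T, S = len(obs), len(pi)
    V = np.zeros((T, S))               # V[t, k] = best score ending in state k at time t
    back = np.zeros((T, S), dtype=int)

    V[0] = pi * B[:, obs[0]]           # V_{1,k} = P(y_1 | k) * pi_k
    for t in range(1, T):
        for k in range(S):
            scores = V[t - 1] * A[:, k] * B[k, obs[t]]
            back[t, k] = np.argmax(scores)      # remember the best predecessor
            V[t, k] = scores[back[t, k]]

    # Trace the back-pointers from the best final state.
    path = [int(np.argmax(V[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
print(viterbi([0, 1, 2], pi, A, B))
```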
Essentially, at every time step the maximum-probability predecessor state is chosen, and the probability of the resulting state sequence is then computed with the simple chain rule.
The algorithm is remarkably similar to the belief propagation used in Bayesian networks.
These algorithms make hidden Markov models remarkably useful, with serious applications in speech recognition and the decoding of convolutional codes.
This is what I am planning to implement within the next two weeks. After that, I will start on inference in dynamic Bayesian networks.

