In the previous section we modeled the problem of aligning a
string to a profile. As with general HMMs, the main problem is to
assign meaningful values to the transition and emission
probabilities of a profile HMM. The Baum-Welch algorithm can be
used to train these probabilities, but we must first show how to
compute the forward and backward probabilities that the algorithm
needs.
Given a string $X$ whose $i$-th character is $x_{i}$, we define:
- The forward probabilities:
- The backward probabilities:
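A common way to write these two definitions is sketched below. It is an assumption based on the standard profile HMM formulation, not a transcription of the original formulas: $S_{j}$ stands for any of the states $M_{j}$, $I_{j}$, $D_{j}$, and $n$ (a symbol introduced here) is the length of $X$.

\begin{align*}
f^{S}_{j}(i) &= P(x_{1} \ldots x_{i},\ \text{the path is in state } S_{j} \text{ once } x_{i} \text{ has been emitted}) \\
b^{S}_{j}(i) &= P(x_{i+1} \ldots x_{n} \mid \text{the path is in state } S_{j} \text{ once } x_{i} \text{ has been emitted})
\end{align*}

Under this reading, $f^{S}_{j}(i) \cdot b^{S}_{j}(i)$ is the joint probability of $X$ and of passing through $S_{j}$ at position $i$, which is the quantity the re-estimation step sums over.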
Computing the Forward Probabilities:
1. Initialization:
2. Recursion:
\begin{equation}\begin{split}
f^{M}_{j}(i) = e_{M_{j}}(x_{i}) \cdot [\, & f^{M}_{j-1}(i-1)\cdot a_{M_{j-1},M_{j}} + \\
& f^{I}_{j-1}(i-1)\cdot a_{I_{j-1},M_{j}} + \\
& f^{D}_{j-1}(i-1)\cdot a_{D_{j-1},M_{j}} \,]
\end{split}\tag{56}\end{equation}

\begin{equation}\begin{split}
f^{I}_{j}(i) = e_{I_{j}}(x_{i}) \cdot [\, & f^{M}_{j}(i-1)\cdot a_{M_{j},I_{j}} + \\
& f^{I}_{j}(i-1)\cdot a_{I_{j},I_{j}} + \\
& f^{D}_{j}(i-1)\cdot a_{D_{j},I_{j}} \,]
\end{split}\tag{57}\end{equation}

\begin{equation}\begin{split}
f^{D}_{j}(i) = \; & f^{M}_{j-1}(i)\cdot a_{M_{j-1},D_{j}} + \\
& f^{I}_{j-1}(i)\cdot a_{I_{j-1},D_{j}} + \\
& f^{D}_{j-1}(i)\cdot a_{D_{j-1},D_{j}}
\end{split}\tag{58}\end{equation}
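The forward recursions can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the table layout, the two-letter transition keys, the treatment of the begin state as $M_{0}$ with $f^{M}_{0}(0)=1$, the termination step that sums the transitions into the end state, and the toy parameters are all hypothetical choices made for the example.

```python
def forward(x, L, a, eM, eI):
    """Forward tables of a profile HMM with L match columns (a sketch).

    a[j]["XY"] is the transition out of state X of column j: "?M" and "?D"
    lead to column j+1, "?I" stays in column j. M_0 is the begin state and
    M_{L+1} the end state (an assumed convention).
    eM[j][c] / eI[j][c] are the emission probabilities of M_j / I_j.
    """
    n = len(x)
    fM = [[0.0] * (n + 1) for _ in range(L + 1)]
    fI = [[0.0] * (n + 1) for _ in range(L + 1)]
    fD = [[0.0] * (n + 1) for _ in range(L + 1)]
    fM[0][0] = 1.0  # assumed initialization: all paths start in the begin state
    for i in range(n + 1):
        for j in range(L + 1):
            if i > 0 and j > 0:  # match state, equation (56)
                fM[j][i] = eM[j][x[i - 1]] * (
                    fM[j - 1][i - 1] * a[j - 1]["MM"]
                    + fI[j - 1][i - 1] * a[j - 1]["IM"]
                    + fD[j - 1][i - 1] * a[j - 1]["DM"])
            if i > 0:  # insert state, equation (57)
                fI[j][i] = eI[j][x[i - 1]] * (
                    fM[j][i - 1] * a[j]["MI"]
                    + fI[j][i - 1] * a[j]["II"]
                    + fD[j][i - 1] * a[j]["DI"])
            if j > 0:  # delete state, equation (58): silent, so i is unchanged
                fD[j][i] = (fM[j - 1][i] * a[j - 1]["MD"]
                            + fI[j - 1][i] * a[j - 1]["ID"]
                            + fD[j - 1][i] * a[j - 1]["DD"])
    # Assumed termination: P(X) sums the transitions from column L into the end state.
    px = fM[L][n] * a[L]["MM"] + fI[L][n] * a[L]["IM"] + fD[L][n] * a[L]["DM"]
    return fM, fI, fD, px


# Hypothetical toy model: L = 2 match columns over the alphabet {A, C}.
L = 2
inner = dict(MM=0.8, MI=0.1, MD=0.1, IM=0.8, II=0.1, ID=0.1, DM=0.8, DI=0.1, DD=0.1)
last = dict(MM=0.9, MI=0.1, MD=0.0, IM=0.9, II=0.1, ID=0.0, DM=0.9, DI=0.1, DD=0.0)
a = [inner, inner, last]  # the D entries of a[0] are never used: there is no D_0
eM = [None, {"A": 0.9, "C": 0.1}, {"A": 0.2, "C": 0.8}]
eI = [{"A": 0.5, "C": 0.5}] * 3
p_ac = forward("AC", L, a, eM, eI)[3]
```

Looping over `i` in the outer loop and `j` in increasing order guarantees that every value on the right-hand side has already been filled in: the match and insert cells read column `i-1`, and the delete cell reads row `j-1` of the current column.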
Computing the Backward Probabilities:
1. Initialization:
2. Recursion:
\begin{equation}\begin{split}
b^{M}_{j}(i) = \; & b^{M}_{j+1}(i+1)\cdot a_{M_{j},M_{j+1}}\cdot e_{M_{j+1}}(x_{i+1}) + \\
& b^{I}_{j}(i+1)\cdot a_{M_{j},I_{j}}\cdot e_{I_{j}}(x_{i+1}) + \\
& b^{D}_{j+1}(i)\cdot a_{M_{j},D_{j+1}}
\end{split}\tag{62}\end{equation}

\begin{equation}\begin{split}
b^{I}_{j}(i) = \; & b^{M}_{j+1}(i+1)\cdot a_{I_{j},M_{j+1}}\cdot e_{M_{j+1}}(x_{i+1}) + \\
& b^{I}_{j}(i+1)\cdot a_{I_{j},I_{j}}\cdot e_{I_{j}}(x_{i+1}) + \\
& b^{D}_{j+1}(i)\cdot a_{I_{j},D_{j+1}}
\end{split}\tag{63}\end{equation}

\begin{equation}\begin{split}
b^{D}_{j}(i) = \; & b^{M}_{j+1}(i+1)\cdot a_{D_{j},M_{j+1}}\cdot e_{M_{j+1}}(x_{i+1}) + \\
& b^{I}_{j}(i+1)\cdot a_{D_{j},I_{j}}\cdot e_{I_{j}}(x_{i+1}) + \\
& b^{D}_{j+1}(i)\cdot a_{D_{j},D_{j+1}}
\end{split}\tag{64}\end{equation}
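A matching Python sketch of the backward recursions is given below. As before, it rests on assumptions that are not taken from the text: the table layout and transition keys are hypothetical, the initialization sets the column-$L$ values at the end of the string to the transition probabilities into the end state, and the toy parameters are invented for illustration.

```python
def backward(x, L, a, eM, eI):
    """Backward tables of a profile HMM with L match columns (a sketch).

    a[j]["XY"] is the transition out of state X of column j: "?M" and "?D"
    lead to column j+1, "?I" stays in column j; M_{L+1} is the end state.
    """
    n = len(x)
    # One extra zero row and column so that j+1 and i+1 are always valid indices.
    bM = [[0.0] * (n + 2) for _ in range(L + 2)]
    bI = [[0.0] * (n + 2) for _ in range(L + 2)]
    bD = [[0.0] * (n + 2) for _ in range(L + 2)]
    for i in range(n, -1, -1):
        for j in range(L, -1, -1):
            if j == L and i == n:
                # Assumed initialization: only the move into the end state remains.
                bM[j][i], bI[j][i], bD[j][i] = a[L]["MM"], a[L]["IM"], a[L]["DM"]
                continue
            # Successors M_{j+1} and I_j emit x_{i+1}; D_{j+1} is silent.
            to_M = bM[j + 1][i + 1] * eM[j + 1][x[i]] if (j < L and i < n) else 0.0
            to_I = bI[j][i + 1] * eI[j][x[i]] if i < n else 0.0
            to_D = bD[j + 1][i]  # the zero row covers j == L
            bM[j][i] = to_M * a[j]["MM"] + to_I * a[j]["MI"] + to_D * a[j]["MD"]  # (62)
            bI[j][i] = to_M * a[j]["IM"] + to_I * a[j]["II"] + to_D * a[j]["ID"]  # (63)
            bD[j][i] = to_M * a[j]["DM"] + to_I * a[j]["DI"] + to_D * a[j]["DD"]  # (64)
    # Row 0 of bD is filled in but unused, since there is no D_0.
    return bM, bI, bD, bM[0][0]  # bM[0][0] is P(X) when M_0 is the begin state


# Hypothetical toy model: L = 2 match columns over the alphabet {A, C}.
L = 2
inner = dict(MM=0.8, MI=0.1, MD=0.1, IM=0.8, II=0.1, ID=0.1, DM=0.8, DI=0.1, DD=0.1)
last = dict(MM=0.9, MI=0.1, MD=0.0, IM=0.9, II=0.1, ID=0.0, DM=0.9, DI=0.1, DD=0.0)
a = [inner, inner, last]
eM = [None, {"A": 0.9, "C": 0.1}, {"A": 0.2, "C": 0.8}]
eI = [{"A": 0.5, "C": 0.5}] * 3
p_ac = backward("AC", L, a, eM, eI)[3]
```

The three successor terms are the same for all three state types of a column; only the transition probabilities that weight them differ, which is why they are computed once per cell.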
The forward and backward variables can then be combined to re-estimate emission and transition probability parameters as follows:
Baum-Welch re-estimation equations for profile HMMs:
1. Expected emission counts from sequence X:
\begin{equation}
E_{M_{k}}(a)=\frac{1}{P(X)}\sum_{i \mid x_{i}=a}{f^{M}_{k}(i)\,b^{M}_{k}(i)}
\tag{65}\end{equation}

\begin{equation}
E_{I_{k}}(a)=\frac{1}{P(X)}\sum_{i \mid x_{i}=a}{f^{I}_{k}(i)\,b^{I}_{k}(i)}
\tag{66}\end{equation}
2. Expected transition counts from sequence X (in the state labels below, $X_{k}$ stands for any of $M_{k}$, $I_{k}$, $D_{k}$):
\begin{equation}
A_{X_{k}M_{k+1}}=\frac{1}{P(X)}\sum_{i}{f^{X}_{k}(i)\,a_{X_{k}M_{k+1}}\,e_{M_{k+1}}(x_{i+1})\,b^{M}_{k+1}(i+1)}
\tag{67}\end{equation}

\begin{equation}
A_{X_{k}I_{k}}=\frac{1}{P(X)}\sum_{i}{f^{X}_{k}(i)\,a_{X_{k}I_{k}}\,e_{I_{k}}(x_{i+1})\,b^{I}_{k}(i+1)}
\tag{68}\end{equation}

\begin{equation}
A_{X_{k}D_{k+1}}=\frac{1}{P(X)}\sum_{i}{f^{X}_{k}(i)\,a_{X_{k}D_{k+1}}\,b^{D}_{k+1}(i)}
\tag{69}\end{equation}
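The expected counts can be accumulated directly from precomputed forward and backward tables, as in the sketch below. The function names, the table layout (`f["M"][k][i]` and so on), the transition keys and the one-column example model are hypothetical. Transitions into the end state are not counted, because equations (67)-(69) do not cover them.

```python
def expected_emissions(x, f, b, px, alphabet):
    """E[k][c]: equations (65)/(66) for one state type (match or insert).

    f and b are that state type's tables, indexed [k][i] with i = 0..len(x).
    """
    E = [dict.fromkeys(alphabet, 0.0) for _ in f]
    for k in range(len(f)):
        for i in range(1, len(x) + 1):
            E[k][x[i - 1]] += f[k][i] * b[k][i] / px
    return E


def expected_transitions(x, L, a, eM, eI, f, b, px):
    """A[k]["XY"]: equations (67)-(69); f and b map "M", "I", "D" to tables."""
    n = len(x)
    A = [dict.fromkeys(a[k], 0.0) for k in range(L + 1)]
    for k in range(L + 1):
        for src in "MID":
            for i in range(n + 1):
                fx = f[src][k][i]
                if i < n:  # moving into an emitting state consumes x_{i+1}
                    if k < L:  # equation (67)
                        A[k][src + "M"] += (fx * a[k][src + "M"]
                                            * eM[k + 1][x[i]] * b["M"][k + 1][i + 1] / px)
                    # equation (68)
                    A[k][src + "I"] += (fx * a[k][src + "I"]
                                        * eI[k][x[i]] * b["I"][k][i + 1] / px)
                if k < L:  # equation (69): delete states are silent, i is unchanged
                    A[k][src + "D"] += fx * a[k][src + "D"] * b["D"][k + 1][i] / px
    return A


# Hypothetical one-column model in which begin -> M_1 -> end is the only path.
zero = dict.fromkeys(("MM", "MI", "MD", "IM", "II", "ID", "DM", "DI", "DD"), 0.0)
a = [{**zero, "MM": 1.0}, {**zero, "MM": 1.0}]
eM = [None, {"A": 0.5, "C": 0.5}]
eI = [{"A": 0.5, "C": 0.5}] * 2
blank = [[0.0, 0.0], [0.0, 0.0]]
# Forward and backward tables for x = "A", worked out by hand for this model.
f = {"M": [[1.0, 0.0], [0.0, 0.5]], "I": blank, "D": blank}
b = {"M": [[0.5, 0.0], [0.0, 1.0]], "I": blank, "D": blank}
px = 0.5
E_M = expected_emissions("A", f["M"], b["M"], px, "AC")
A = expected_transitions("A", 1, a, eM, eI, f, b, px)
```

Because the example model admits a single path, the expected counts are whole numbers: `E_M[1]["A"]` and `A[0]["MM"]` are both 1 and every other count is 0. With several training sequences the counts would be summed over the sequences and then normalized per state to obtain the new probabilities.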
Peer Itsik
2000-12-19