For this we shall need some extra definitions.
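Forward algorithm: Given a sequence $X = (x_1, \ldots, x_L)$,
let us denote by $f_k(i)$ the probability of emitting the prefix
$(x_1, \ldots, x_i)$ and eventually reaching the state $\pi_i = k$:
$$f_k(i) = P(x_1, \ldots, x_i,\ \pi_i = k) \tag{20}$$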
We use the same initial values for $f_k(0)$ as in the
Viterbi algorithm:
$$f_{\mathrm{begin}}(0) = 1 \tag{21}$$
$$\forall k \neq \mathrm{begin}: \quad f_k(0) = 0 \tag{22}$$
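In analogy to 6.15, we can use the recursive formula, for
$i = 1, \ldots, L$:
$$f_l(i) = e_l(x_i) \sum_k f_k(i-1)\, a_{kl} \tag{23}$$
where $a_{kl}$ is the transition probability from state $k$ to
state $l$ and $e_l(x_i)$ is the probability that state $l$ emits $x_i$.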
We terminate the process by calculating:
$$P(X) = \sum_k f_k(L)\, a_{k,\mathrm{end}} \tag{24}$$
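The forward recursion can be sketched in Python as follows. This is a minimal illustration rather than part of the original notes: the function name `forward`, the dictionary layout of the parameters, and the two-state coin model are all hypothetical, and the sketch assumes an explicit end state with transition probabilities `a_end`.

```python
def forward(x, states, a_begin, a, a_end, e):
    """Return (f, px): f[i][k] = P(x_1..x_i, pi_i = k) for i = 1..L, px = P(X).

    a_begin[k]: begin -> k transition; a[k][l]: k -> l transition;
    a_end[k]: k -> end transition; e[k][c]: probability that state k emits c.
    Assumes len(x) >= 1.
    """
    L = len(x)
    f = [{} for _ in range(L + 1)]  # f[0] stays empty so f[i] lines up with position i
    # Position 1: f_begin(0) = 1 and f_k(0) = 0 otherwise, so the sum
    # over predecessors reduces to the single term a_begin[l].
    for l in states:
        f[1][l] = e[l][x[0]] * a_begin[l]
    # Positions 2..L: f_l(i) = e_l(x_i) * sum_k f_k(i-1) * a_kl
    for i in range(2, L + 1):
        for l in states:
            f[i][l] = e[l][x[i - 1]] * sum(f[i - 1][k] * a[k][l] for k in states)
    # Termination: P(X) = sum_k f_k(L) * a_{k,end}
    px = sum(f[L][k] * a_end[k] for k in states)
    return f, px


# Hypothetical two-state model: a fair coin "F" and a biased coin "B".
STATES = ["F", "B"]
A_BEGIN = {"F": 0.5, "B": 0.5}
A = {"F": {"F": 0.8, "B": 0.1}, "B": {"F": 0.1, "B": 0.8}}
A_END = {"F": 0.1, "B": 0.1}
E = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.9, "T": 0.1}}

f, px = forward("HT", STATES, A_BEGIN, A, A_END, E)
```

Each of the $L$ positions requires a sum over all predecessor states for every state, which is where the quadratic dependence on the number of states comes from.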
Backward algorithm: In a complementary manner, we denote by
$b_k(i)$ the probability of the suffix $(x_{i+1}, \ldots, x_L)$
given $\pi_i = k$:
$$b_k(i) = P(x_{i+1}, \ldots, x_L \mid \pi_i = k) \tag{25}$$
In this case, we initialize:
$$\forall k: \quad b_k(L) = a_{k,\mathrm{end}} \tag{26}$$
The recursive formula is:
$$b_k(i) = \sum_l a_{kl}\, e_l(x_{i+1})\, b_l(i+1) \tag{27}$$
We terminate the process by calculating:
$$P(X) = \sum_l a_{\mathrm{begin},l}\, e_l(x_1)\, b_l(1) \tag{28}$$
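The backward recursion can be sketched in the same way. Again, this is only an illustrative sketch: the function name `backward`, the parameter layout, and the two-state coin model are hypothetical, and an explicit end state with transition probabilities `a_end` is assumed.

```python
def backward(x, states, a_begin, a, a_end, e):
    """Return (b, px): b[i][k] = P(x_{i+1}..x_L | pi_i = k) for i = 1..L, px = P(X).

    a_begin[k]: begin -> k transition; a[k][l]: k -> l transition;
    a_end[k]: k -> end transition; e[k][c]: probability that state k emits c.
    Assumes len(x) >= 1.
    """
    L = len(x)
    b = [{} for _ in range(L + 1)]  # b[0] stays empty so b[i] lines up with position i
    # Initialization: b_k(L) = a_{k,end}
    for k in states:
        b[L][k] = a_end[k]
    # Positions L-1 down to 1: b_k(i) = sum_l a_kl * e_l(x_{i+1}) * b_l(i+1).
    # With 0-based strings, x[i] is the symbol x_{i+1}.
    for i in range(L - 1, 0, -1):
        for k in states:
            b[i][k] = sum(a[k][l] * e[l][x[i]] * b[i + 1][l] for l in states)
    # Termination: P(X) = sum_l a_{begin,l} * e_l(x_1) * b_l(1)
    px = sum(a_begin[l] * e[l][x[0]] * b[1][l] for l in states)
    return b, px


# Hypothetical two-state model: a fair coin "F" and a biased coin "B".
STATES = ["F", "B"]
A_BEGIN = {"F": 0.5, "B": 0.5}
A = {"F": {"F": 0.8, "B": 0.1}, "B": {"F": 0.1, "B": 0.8}}
A_END = {"F": 0.1, "B": 0.1}
E = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.9, "T": 0.1}}

b, px = backward("HT", STATES, A_BEGIN, A, A_END, E)
```

For the same model and sequence, the value of $P(X)$ returned here should agree with the one obtained from the forward termination step, which makes a convenient sanity check.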
Complexity: All the values of $f_k(i)$ and $b_k(i)$ can
be calculated in $O(L \cdot |Q|^2)$ time and stored in
$O(L \cdot |Q|)$ space, where $|Q|$ is the number of states,
as is the case with the Viterbi algorithm.
There is, however, one important difference: here we cannot
trivially use logarithmic weights, since (unlike in Viterbi)
we not only multiply probabilities but also sum them. This may
lead to numerical stability problems (underflow) unless proper
measures, such as scaling the probabilities, are taken.
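One common way to scale the probabilities is to normalize the forward values at every position and accumulate the logarithms of the normalization factors. The sketch below illustrates this idea under stated assumptions: the function name `forward_scaled`, the parameter layout, and the two-state coin model are hypothetical, an explicit end state is assumed, and every position is assumed to have nonzero probability (otherwise `math.log` raises an error).

```python
import math


def forward_scaled(x, states, a_begin, a, a_end, e):
    """Return log P(X), rescaling the forward values at every position.

    a_begin[k]: begin -> k transition; a[k][l]: k -> l transition;
    a_end[k]: k -> end transition; e[k][c]: probability that state k emits c.
    Assumes len(x) >= 1 and that no position has total probability zero.
    """
    log_px = 0.0
    prev = None  # scaled forward values of the previous position
    for i, c in enumerate(x):
        if i == 0:
            col = {l: e[l][c] * a_begin[l] for l in states}
        else:
            col = {l: e[l][c] * sum(prev[k] * a[k][l] for k in states)
                   for l in states}
        s = sum(col.values())      # scaling factor for this position
        log_px += math.log(s)      # the product of the factors is kept in log space
        prev = {k: v / s for k, v in col.items()}  # scaled values sum to 1
    # Termination with the scaled values of the last position.
    log_px += math.log(sum(prev[k] * a_end[k] for k in states))
    return log_px


# Hypothetical two-state model: a fair coin "F" and a biased coin "B".
STATES = ["F", "B"]
A_BEGIN = {"F": 0.5, "B": 0.5}
A = {"F": {"F": 0.8, "B": 0.1}, "B": {"F": 0.1, "B": 0.8}}
A_END = {"F": 0.1, "B": 0.1}
E = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.9, "T": 0.1}}

# A sequence long enough that the unscaled product would underflow to 0.0.
log_px = forward_scaled("HT" * 2000, STATES, A_BEGIN, A, A_END, E)
```

Because the scaled values at each position sum to 1, they stay in a safe numeric range regardless of the sequence length, and only the accumulated logarithm grows in magnitude.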
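Using the forward and backward probabilities we can compute the
value of $P(\pi_i = k \mid X)$.
Since the process only has memory of
length 1, there is a dependency only on the last state, so we can
write:
$$\begin{aligned}
P(X, \pi_i = k) &= P(x_1, \ldots, x_i,\ \pi_i = k)\, P(x_{i+1}, \ldots, x_L \mid x_1, \ldots, x_i,\ \pi_i = k) \\
&= P(x_1, \ldots, x_i,\ \pi_i = k)\, P(x_{i+1}, \ldots, x_L \mid \pi_i = k) \\
&= f_k(i)\, b_k(i)
\end{aligned}$$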
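Using the definition of conditional probability, we obtain the
solution to the likelihood problem:
$$P(\pi_i = k \mid X) = \frac{P(X, \pi_i = k)}{P(X)} = \frac{f_k(i)\, b_k(i)}{P(X)}$$
where $P(X)$ is the result of the forward (or, equivalently, the
backward) calculation:
$$P(X) = \sum_k f_k(L)\, a_{k,\mathrm{end}} \tag{31}$$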
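Combining the two passes, the posterior $P(\pi_i = k \mid X) = f_k(i)\, b_k(i) / P(X)$ can be sketched as below. This is an illustrative, unscaled sketch suitable only for short sequences: the function name `posterior`, the parameter layout, and the two-state coin model are hypothetical, and an explicit end state is assumed.

```python
def posterior(x, states, a_begin, a, a_end, e):
    """Return a list post of length L with post[i-1][k] = P(pi_i = k | X).

    a_begin[k]: begin -> k transition; a[k][l]: k -> l transition;
    a_end[k]: k -> end transition; e[k][c]: probability that state k emits c.
    Assumes len(x) >= 1 and P(X) > 0. No scaling is applied.
    """
    L = len(x)
    f = [{} for _ in range(L + 1)]  # index 0 unused; f[i] is position i
    b = [{} for _ in range(L + 1)]  # index 0 unused; b[i] is position i
    # Forward pass.
    for l in states:
        f[1][l] = e[l][x[0]] * a_begin[l]
    for i in range(2, L + 1):
        for l in states:
            f[i][l] = e[l][x[i - 1]] * sum(f[i - 1][k] * a[k][l] for k in states)
    # Backward pass.
    for k in states:
        b[L][k] = a_end[k]
    for i in range(L - 1, 0, -1):
        for k in states:
            b[i][k] = sum(a[k][l] * e[l][x[i]] * b[i + 1][l] for l in states)
    # P(X) from the forward termination step.
    px = sum(f[L][k] * a_end[k] for k in states)
    # Posterior: f_k(i) * b_k(i) / P(X) for every position and state.
    return [{k: f[i][k] * b[i][k] / px for k in states} for i in range(1, L + 1)]


# Hypothetical two-state model: a fair coin "F" and a biased coin "B".
STATES = ["F", "B"]
A_BEGIN = {"F": 0.5, "B": 0.5}
A = {"F": {"F": 0.8, "B": 0.1}, "B": {"F": 0.1, "B": 0.8}}
A_END = {"F": 0.1, "B": 0.1}
E = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.9, "T": 0.1}}

post = posterior("HT", STATES, A_BEGIN, A, A_END, E)
```

At every position the posterior values over all states should sum to 1, which is a quick check that the forward and backward passes are consistent with each other.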