Given a pair of indices $i \le n$ and $j \le m$, the local suffix alignment problem is to find a (possibly empty) suffix $\alpha$ of $S[1..i]$ and a (possibly empty) suffix $\beta$ of $T[1..j]$ such that the score of their alignment is the maximum over all scores of alignments of suffixes of $S[1..i]$ and $T[1..j]$. We use V(i, j) to denote the value of the optimal local suffix alignment for a given pair i, j of indices.
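We choose the weights of the editing operations as:
$$
\sigma(x, y)
\begin{cases}
\ge 0 & \text{if } x, y \text{ match} \\
\le 0 & \text{if } x, y \text{ do not match, or one of them is a space}
\end{cases}
$$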
The algorithm needs to:
1. Find the maximum similarity between suffixes of $S[1..i]$ and $T[1..j]$.
2. Discard prefixes of S and T whose similarity is $\le 0$ and which therefore cannot increase the overall similarity.
3. Find the best indices i*, j* of S and T respectively, after which the similarity only decreases.
Note that extending the optimal solution, either to the right or to the left, cannot increase the overall similarity.
Recursive definition:
The base conditions are V(i, 0) = 0 and V(0, j) = 0 for all i and j, since we can always choose an empty suffix.
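For i > 0 and j > 0 the recurrence for V(i, j) is
$$
V(i, j) = \max
\begin{cases}
0 \\
V(i-1, j-1) + \sigma(S[i], T[j]) \\
V(i-1, j) + \sigma(S[i], -) \\
V(i, j-1) + \sigma(-, T[j])
\end{cases}
$$
where $\sigma(x, y)$ denotes the score of aligning $x$ with $y$ and $-$ denotes a space.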
Compute i*, j* so that:
$$
V(i^*, j^*) = \max_{1 \le i \le n,\; 1 \le j \le m} V(i, j)
$$
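The recurrence and the search for (i*, j*) can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the names `sigma` and `local_alignment_table` and the concrete weights (+2 for a match, -1 for a mismatch or a space) are assumptions chosen for illustration. The text only requires that matches score at least 0 and everything else at most 0.

```python
def sigma(x, y):
    """Assumed example weights: +2 for a match, -1 for a mismatch or a space ('-')."""
    if x == "-" or y == "-":
        return -1
    return 2 if x == y else -1


def local_alignment_table(S, T):
    """Fill the table V and return it together with the indices (i*, j*).

    Indices are 1-based as in the text, so S[i] in the text is S[i - 1] here.
    """
    n, m = len(S), len(T)
    # Row 0 and column 0 hold the base conditions V(i, 0) = V(0, j) = 0.
    V = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_i, best_j = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            V[i][j] = max(
                0,                                        # restart: empty suffixes
                V[i - 1][j - 1] + sigma(S[i - 1], T[j - 1]),  # match or mismatch
                V[i - 1][j] + sigma(S[i - 1], "-"),       # S[i] against a space
                V[i][j - 1] + sigma("-", T[j - 1]),       # T[j] against a space
            )
            if V[i][j] > best:                            # track the maximal cell
                best, best_i, best_j = V[i][j], i, j
    return V, best_i, best_j


V, i_star, j_star = local_alignment_table("xxACGTyy", "zzACGTww")
print(i_star, j_star, V[i_star][j_star])
```

With these assumed weights, the shared substring ACGT is the only region that scores positively, so the maximal cell is the one just after it in both strings.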
Observe that the recurrence for computing local suffix alignment is almost identical to the one used for computing global alignment. The only difference is the inclusion of zero in the case of local suffix alignment. In both global alignment and local suffix alignment of the prefixes $S[1..i]$ and $T[1..j]$, the terminating characters of any alignment are specified, but in the case of local suffix alignment any number of initial characters can be ignored.
The zero in the recurrence implements this by 'restarting' the recurrence: including 0 in the maximization ensures that prefixes with negative score are discarded from the computation.
Including the zero only handles poorly matching prefixes, however. We still need to determine where the alignment should stop, so that the similarity value does not decrease.
Therefore, after computing the table of V(i, j) values, we search for a cell with the maximal value and ignore all table entries beyond that point.
As usual, pointers are created while filling in the values of the table. After cell (i*, j*) is found, the substrings $\alpha$ and $\beta$ giving the optimal local alignment of S and T are found by tracing back the pointers from cell (i*, j*) until reaching an entry (i', j') that has value zero. Then the optimal local alignment substrings are $\alpha = S[i'+1..i^*]$ and $\beta = T[j'+1..j^*]$, since the zero-valued entry (i', j') itself corresponds to the empty alignment.
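The traceback described above can be sketched as follows. This is an illustrative sketch under assumptions: the function name `local_align`, the pointer labels `"diag"`, `"up"`, `"left"`, and the weights in `sigma` (+2 match, -1 mismatch or space) are not from the text.

```python
def sigma(x, y):
    """Assumed example weights: +2 for a match, -1 for a mismatch or a space ('-')."""
    if x == "-" or y == "-":
        return -1
    return 2 if x == y else -1


def local_align(S, T):
    """Return (score, alpha, beta, row_S, row_T) for an optimal local alignment."""
    n, m = len(S), len(T)
    V = [[0] * (m + 1) for _ in range(n + 1)]
    # ptr[i][j] records which case of the recurrence produced V[i][j];
    # None means the zero case (a restart), where the traceback stops.
    ptr = [[None] * (m + 1) for _ in range(n + 1)]
    best, i_star, j_star = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (0, None),
                (V[i - 1][j - 1] + sigma(S[i - 1], T[j - 1]), "diag"),
                (V[i - 1][j] + sigma(S[i - 1], "-"), "up"),
                (V[i][j - 1] + sigma("-", T[j - 1]), "left"),
            ]
            # max returns the first maximal item, so ties with 0 pick the restart.
            V[i][j], ptr[i][j] = max(candidates, key=lambda c: c[0])
            if V[i][j] > best:
                best, i_star, j_star = V[i][j], i, j

    # Trace back from (i*, j*) until an entry with value zero is reached.
    i, j = i_star, j_star
    row_S, row_T = [], []
    while V[i][j] > 0:
        move = ptr[i][j]
        if move == "diag":
            row_S.append(S[i - 1]); row_T.append(T[j - 1]); i -= 1; j -= 1
        elif move == "up":
            row_S.append(S[i - 1]); row_T.append("-"); i -= 1
        else:  # "left"
            row_S.append("-"); row_T.append(T[j - 1]); j -= 1

    # (i, j) is now the zero entry (i', j'); in 0-based slicing,
    # S[i:i_star] is the substring S[i'+1..i*] of the text.
    alpha, beta = S[i:i_star], T[j:j_star]
    return best, alpha, beta, "".join(reversed(row_S)), "".join(reversed(row_T))


print(local_align("AAACCC", "AAAGCCC"))
```

Storing one pointer per cell is a design choice that keeps the traceback simple. It also means the whole table must be kept in memory while tracing back.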
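At first glance the space complexity appears to be O(mn). However, only O(m) space is needed to compute V(i*, j*) and the indices i*, j*: each row of the table depends only on the row directly above it, so it suffices to keep two rows of m + 1 entries at a time, together with the largest value seen so far and its position.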
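One way to see the O(m) bound is to keep only two rows of the table while filling it. The sketch below does this under assumptions: the function name `local_alignment_score_linear_space` and the weights in `sigma` (+2 match, -1 mismatch or space) are illustrative and not taken from the text. It returns only the optimal value and the indices (i*, j*); no pointers are kept, so it does not recover the alignment itself.

```python
def sigma(x, y):
    """Assumed example weights: +2 for a match, -1 for a mismatch or a space ('-')."""
    if x == "-" or y == "-":
        return -1
    return 2 if x == y else -1


def local_alignment_score_linear_space(S, T):
    """Return (V(i*, j*), i*, j*) using two rows of length len(T) + 1."""
    n, m = len(S), len(T)
    prev = [0] * (m + 1)              # row i - 1; initially row 0, all zeros
    best, best_i, best_j = 0, 0, 0
    for i in range(1, n + 1):
        curr = [0] * (m + 1)          # row i; curr[0] = V(i, 0) = 0
        for j in range(1, m + 1):
            curr[j] = max(
                0,
                prev[j - 1] + sigma(S[i - 1], T[j - 1]),
                prev[j] + sigma(S[i - 1], "-"),
                curr[j - 1] + sigma("-", T[j - 1]),
            )
            if curr[j] > best:
                best, best_i, best_j = curr[j], i, j
        prev = curr                   # the older row is no longer needed
    return best, best_i, best_j


print(local_alignment_score_linear_space("AAACCC", "AAAGCCC"))
```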
Complexity:
- Time complexity. Since it takes a constant number of operations per cell to compute V(i, j), it takes O(mn) time to fill in the entire table. The search for V(i*, j*) requires O(mn) time as well. Hence the total time complexity is O(mn).
- Space complexity. As shown above, the space complexity is O(m).
Peer Itsik
2000-11-20