Computing local alignment

Given a pair of indices i and j, the local suffix alignment problem is to find a (possibly empty) suffix α of S[1..i] and a (possibly empty) suffix β of T[1..j] such that the score of their alignment is the maximum over all scores of alignments of suffixes of S[1..i] and T[1..j]. We use V(i, j) to denote the value of the optimal local suffix alignment for a given pair i, j of indices. We choose the weights of the editing operations as:

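\[
\sigma(x, y)
\begin{cases}
\ge 0 & \text{if } x \text{ and } y \text{ match} \\
\le 0 & \text{if } x \text{ and } y \text{ do not match, or one of them is a space}
\end{cases}
\]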

The algorithm needs to:

1. Find the maximum similarity between suffixes of S[1..i] and T[1..j].
2. Discard the prefixes S[1..i], T[1..j] whose similarity is ≤ 0 and which therefore decrease the overall similarity.
3. Find the best indices i^*, j^* of S and T respectively after which the similarity only decreases.

Note that any extension of the optimal solution, either to the right or to the left, decreases the overall similarity.

Recursive definition: The base conditions are V(i, 0) = 0 and V(0, j) = 0, since we can always choose an empty suffix. For i > 0 and j > 0 the recurrence for V(i, j) is

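\[
V(i, j) = \max
\begin{cases}
0 \\
V(i-1, j-1) + \sigma(S[i], T[j]) \\
V(i-1, j) + \sigma(S[i], \text{-}) \\
V(i, j-1) + \sigma(\text{-}, T[j])
\end{cases}
\]

where σ(x, y) denotes the weight of aligning x with y, and - denotes a space.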
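
The recurrence translates fairly directly into a table-filling routine. The following Python is a minimal sketch rather than the notes' own implementation: the names `sigma` and `local_suffix_table` are hypothetical, and the concrete weights (+2 for a match, -1 for a mismatch or a space) are assumptions chosen only so that matches are non-negative and everything else is non-positive.

```python
def sigma(x, y):
    """Hypothetical weights: +2 for a match, -1 for a mismatch or a space ('-')."""
    if x == "-" or y == "-":
        return -1
    return 2 if x == y else -1


def local_suffix_table(S, T):
    """Fill V, where V[i][j] is the best local suffix alignment value of S[1..i], T[1..j]."""
    n, m = len(S), len(T)
    # Row 0 and column 0 stay 0: the base conditions V(i, 0) = V(0, j) = 0.
    V = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            V[i][j] = max(
                0,                                          # restart with empty suffixes
                V[i - 1][j - 1] + sigma(S[i - 1], T[j - 1]),  # align S[i] with T[j]
                V[i - 1][j] + sigma(S[i - 1], "-"),           # align S[i] with a space
                V[i][j - 1] + sigma("-", T[j - 1]),           # align T[j] with a space
            )
    return V


V = local_suffix_table("AB", "AB")
```

Python strings are 0-based, so the character written S[i] in the text is `S[i - 1]` in the code. With the assumed weights, `V[2][2]` is 4 for the two matched characters, while `V[1][2]` is 1 (one match followed by one space).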

Compute i^*, j^* so that:

\[
V(i^*, j^*) = \max_{i,\, j} V(i, j)
\]

Observe that the recurrence for computing local suffix alignment is almost identical to the one used for computing global alignment. The only difference is the inclusion of zero in the case of local suffix alignment. In both global alignment and local suffix alignment of the prefixes S[1..i] and T[1..j], the terminating characters of any alignment are specified, but in local suffix alignment any number of initial characters can be ignored. The zero in the recurrence implements this by 'restarting' the recurrence, which ensures that prefixes with negative score are discarded from the computation.

The zero in the maximization handles only poorly matching prefixes. We still need to determine where the alignment should stop, so that the similarity value does not decrease. Therefore, after computing the table of V(i, j) values, we search for a cell with the maximal value and ignore all table entries from that point on.
50#50
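
To make the comparison with global alignment concrete, here is a small sketch in which a single flag switches between the two recurrences, followed by the search for a maximal cell. The function names `fill_table` and `best_cell`, the example strings, and the weights (+2 match, -1 mismatch, -1 space) are all assumptions made for illustration.

```python
def fill_table(S, T, local, match=2, mismatch=-1, space=-1):
    """Fill the alignment table; `local` toggles the zero in the recurrence."""
    n, m = len(S), len(T)
    V = [[0] * (m + 1) for _ in range(n + 1)]
    if not local:
        # Global alignment: a prefix aligned against nothing costs one space per character.
        for i in range(1, n + 1):
            V[i][0] = i * space
        for j in range(1, m + 1):
            V[0][j] = j * space
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if S[i - 1] == T[j - 1] else mismatch
            best = max(V[i - 1][j - 1] + sub, V[i - 1][j] + space, V[i][j - 1] + space)
            # The only difference between the two recurrences is this zero.
            V[i][j] = max(0, best) if local else best
    return V


def best_cell(V):
    """Return (i*, j*) of a maximal entry; ties go to the first cell in row-major order."""
    best_i, best_j = 0, 0
    for i, row in enumerate(V):
        for j, value in enumerate(row):
            if value > V[best_i][best_j]:
                best_i, best_j = i, j
    return best_i, best_j


S, T = "XXABCXX", "YYABCYY"
V_global = fill_table(S, T, local=False)
V_local = fill_table(S, T, local=True)
i_star, j_star = best_cell(V_local)
```

With these assumed weights the global value `V_global[7][7]` is 2, because the four X/Y mismatches are charged against the three matches. The local table reaches its maximum of 6 at cell (5, 5), right after the shared "ABC", and the entries beyond that cell only decrease (for instance `V_local[7][7]` is 4), which is why the search for the maximal cell is needed in addition to the zero.
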
As usual, pointers are created while filling in the values of the table. After cell (i^*, j^*) is found, the substrings α and β giving the optimal local alignment of S and T are found by tracing back the pointers from cell (i^*, j^*) until reaching an entry (i', j') that has value zero. Then the optimal local alignment substrings are α = S[i'+1..i^*] and β = T[j'+1..j^*] (the zero-valued cell (i', j') itself contributes no characters). Stored this way, the table takes O(mn) space; we will show that only O(m) space is needed:
53#53

54#54
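
The table fill with pointers, the search for (i^*, j^*), and the traceback to a zero-valued cell described above can be put together as follows. This is a sketch under stated assumptions, not the notes' own code: the name `local_align`, the string labels used as pointers, and the default weights (+2 match, -1 mismatch, -1 space) are hypothetical. It stores the full table and therefore uses O(mn) space.

```python
def local_align(S, T, match=2, mismatch=-1, space=-1):
    """Return (score, aligned_S, aligned_T, (i', j'), (i*, j*)) for a best local alignment."""
    n, m = len(S), len(T)
    V = [[0] * (m + 1) for _ in range(n + 1)]
    ptr = [[None] * (m + 1) for _ in range(n + 1)]  # None marks a restart (value 0)
    best, bi, bj = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if S[i - 1] == T[j - 1] else mismatch
            candidates = [
                (V[i - 1][j - 1] + sub, "diag"),
                (V[i - 1][j] + space, "up"),
                (V[i][j - 1] + space, "left"),
            ]
            value, move = max(candidates, key=lambda c: c[0])
            if value > 0:
                V[i][j], ptr[i][j] = value, move
            # Track the maximal cell (i*, j*) while filling.
            if V[i][j] > best:
                best, bi, bj = V[i][j], i, j

    # Trace the pointers back from (i*, j*) until an entry with value zero.
    a, b = [], []
    i, j = bi, bj
    while V[i][j] > 0:
        move = ptr[i][j]
        if move == "diag":
            a.append(S[i - 1]); b.append(T[j - 1]); i -= 1; j -= 1
        elif move == "up":
            a.append(S[i - 1]); b.append("-"); i -= 1
        else:
            a.append("-"); b.append(T[j - 1]); j -= 1
    return best, "".join(reversed(a)), "".join(reversed(b)), (i, j), (bi, bj)


result = local_align("ACGT", "ACT")
```

For the assumed weights, `local_align("ACGT", "ACT")` aligns `ACGT` with `AC-T` for a score of 5, starting from the zero cell (0, 0) and ending at (4, 3). In Python's 0-based slicing the aligned substring of S is `S[i':i*]`.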
Complexity:

Time complexity. Since it takes a constant number of operations per cell to compute V(i, j), it takes O(mn) time to fill in the entire table. The search for V(i^*, j^*) requires O(mn) time as well. Hence the total time complexity is O(mn).
Space complexity. As shown in the lemma above, the space complexity is O(m).
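
One way to see where a linear space bound can come from: each row of the table depends only on the row before it, so the value V(i^*, j^*) and the cell (i^*, j^*) can be found while keeping just two rows. The sketch below illustrates that observation only; it is not necessarily the argument of the lemma, the name `local_score_linear_space` and the weights (+2 match, -1 mismatch, -1 space) are assumptions, and m is taken to be the length of T. It does not recover the alignment itself, which would need the pointers or a further technique.

```python
def local_score_linear_space(S, T, match=2, mismatch=-1, space=-1):
    """Return (V(i*, j*), i*, j*) using two rows of length len(T) + 1."""
    m = len(T)
    prev = [0] * (m + 1)  # row i - 1; row 0 is all zeros (base condition)
    best, best_i, best_j = 0, 0, 0
    for i in range(1, len(S) + 1):
        curr = [0] * (m + 1)  # curr[0] = 0 is the base condition V(i, 0) = 0
        for j in range(1, m + 1):
            sub = match if S[i - 1] == T[j - 1] else mismatch
            curr[j] = max(
                0,
                prev[j - 1] + sub,    # V(i-1, j-1) + weight of aligning S[i] with T[j]
                prev[j] + space,      # V(i-1, j) + weight of a space
                curr[j - 1] + space,  # V(i, j-1) + weight of a space
            )
            if curr[j] > best:
                best, best_i, best_j = curr[j], i, j
        prev = curr  # the older row is no longer needed
    return best, best_i, best_j


score, i_star, j_star = local_score_linear_space("XXABCXX", "YYABCYY")
```

The running time is unchanged at O(mn), since every cell is still computed once; only the storage drops from the full table to two rows.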

Peer Itsik
2000-11-20