In this section, we present an approximation algorithm for the multiple
alignment problem under the SP (sum-of-pairs) metric
(see e.g. [4, pp. 348-350]). The algorithm
achieves an approximation ratio of two.
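Definitions:
Assume $\sigma(x,y)$ is our scoring
function, i.e., the price of aligning the character x with the character y.
We assume that $\sigma$ satisfies the triangle inequality: for each x, y and z,
$\sigma(x,y) \le \sigma(x,z) + \sigma(z,y)$.
We denote by D(S,Y) the score of the optimal alignment between
sequences S and Y.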
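As an illustration of the definition of D(S,Y), here is a minimal sketch of the standard dynamic program for the optimal pairwise alignment score. The concrete scoring function `sigma` (cost 0 for identical characters, 1 otherwise, with `'-'` standing for a space) is an assumption made for this example only and is not taken from the notes; it was chosen because it satisfies the triangle inequality.

```python
def sigma(x, y):
    """Hypothetical unit-cost scoring function (an assumption of this sketch):
    0 if the two symbols are identical, 1 otherwise. '-' denotes a space."""
    return 0 if x == y else 1


def D(s, y):
    """Score of an optimal global alignment of s and y (lower is better)."""
    m = len(y)
    # prev[j] = optimal score of aligning the current prefix of s with y[:j];
    # initially the prefix of s is empty, so every character of y faces a space.
    prev = [0] * (m + 1)
    for j in range(1, m + 1):
        prev[j] = prev[j - 1] + sigma('-', y[j - 1])
    for i in range(1, len(s) + 1):
        cur = [prev[0] + sigma(s[i - 1], '-')] + [0] * m
        for j in range(1, m + 1):
            cur[j] = min(
                prev[j - 1] + sigma(s[i - 1], y[j - 1]),  # align two characters
                prev[j] + sigma(s[i - 1], '-'),           # s[i-1] against a space
                cur[j - 1] + sigma('-', y[j - 1]),        # y[j-1] against a space
            )
        prev = cur
    return prev[m]
```

With this particular `sigma`, `D` coincides with the unit-cost edit distance; any other scoring function could be substituted without changing the recurrence.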
Figure 4.2: A generic center star for six strings, where the center string ($S_c$) is $S_3$.
The Center Star Algorithm:
1. Find $S_t \in \{S_1, \ldots, S_k\}$ minimizing $\sum_{i \ne t} D(S_i, S_t)$ and let $M = \{S_t\}$.
2. Add the sequences in $\{S_1, \ldots, S_k\} \setminus \{S_t\}$ to $M$ one by one so
that the alignment of every newly added sequence with $S_t$ is
optimal. Add spaces, when needed, to all pre-aligned sequences.
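The two steps above can be sketched in Python as follows. This is an illustrative sketch, not the notes' own implementation: the function names (`sigma`, `align`, `merge`, `center_star`), the unit-cost scoring function, and the use of `'-'` as the space character are all assumptions made for the example, and the input sequences are assumed not to contain `'-'`.

```python
def sigma(x, y):
    """Hypothetical unit-cost scoring (assumption): 0 if equal, else 1."""
    return 0 if x == y else 1


def align(s, t):
    """Optimal global alignment: returns (score, s_with_spaces, t_with_spaces)."""
    n, m = len(s), len(t)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + sigma(s[i - 1], '-')
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + sigma('-', t[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(dp[i - 1][j - 1] + sigma(s[i - 1], t[j - 1]),
                           dp[i - 1][j] + sigma(s[i - 1], '-'),
                           dp[i][j - 1] + sigma('-', t[j - 1]))
    a, b, i, j = [], [], n, m
    while i > 0 or j > 0:  # trace one optimal path back to (0, 0)
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + sigma(s[i - 1], t[j - 1]):
            a.append(s[i - 1]); b.append(t[j - 1]); i -= 1; j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + sigma(s[i - 1], '-'):
            a.append(s[i - 1]); b.append('-'); i -= 1
        else:
            a.append('-'); b.append(t[j - 1]); j -= 1
    return dp[n][m], ''.join(reversed(a)), ''.join(reversed(b))


def merge(rows, a, b):
    """Add the row b to the alignment `rows`, where rows[0] is the center with
    its current spaces and a is the center as spaced by its alignment with b."""
    c, out, new_b = rows[0], [[] for _ in rows], []
    i = j = 0
    while i < len(c) or j < len(a):
        if i < len(c) and j < len(a) and c[i] == a[j]:
            for r, row in zip(out, rows):   # columns agree: copy both
                r.append(row[i])
            new_b.append(b[j]); i += 1; j += 1
        elif i < len(c) and c[i] == '-':
            for r, row in zip(out, rows):   # space only in the old alignment
                r.append(row[i])
            new_b.append('-'); i += 1
        else:
            for r in out:                   # space only in the new pairwise
                r.append('-')               # alignment: pad all earlier rows
            new_b.append(b[j]); j += 1
    return [''.join(r) for r in out] + [''.join(new_b)]


def center_star(seqs):
    """Return (index of the center, aligned rows in input order)."""
    k = len(seqs)
    # Step 1: the center minimizes the sum of optimal pairwise scores.
    total = [sum(align(seqs[i], seqs[j])[0] for j in range(k)) for i in range(k)]
    t = min(range(k), key=total.__getitem__)
    # Step 2: add the remaining sequences one by one.
    rows, order = [seqs[t]], [t]
    for i in range(k):
        if i != t:
            _, a, b = align(seqs[t], seqs[i])
            rows = merge(rows, a, b)
            order.append(i)
    result = [None] * k
    for idx, row in zip(order, rows):
        result[idx] = row
    return t, result


if __name__ == "__main__":
    center, rows = center_star(["ACGT", "AGT", "ACT", "ACGGT"])
    print("center index:", center)
    print("\n".join(rows))
```

The `merge` helper implements the "add spaces, when needed, to all pre-aligned sequences" rule: a space introduced into the center by a new pairwise alignment is propagated as a column of spaces to every row already in the alignment, so the alignment each row induces with the center is left unchanged.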
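Running time analysis (for $k$ sequences, each of length at most $n$):
1. $\binom{k}{2} \cdot O(n^2) = O(k^2 n^2)$ for step 1.
2. $\sum_{i=1}^{k-1} O((i \cdot n) \cdot n) = O(k^2 n^2)$ for step 2.
(Since the worst-case length of $S'_c$, the center string with its inserted spaces, after the addition of $i$ strings is
$O(i \cdot n)$.)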
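Approximation analysis:
- Let $M$ denote the multiple alignment of the algorithm.
- Let $d(i,j)$ be the score of the pairwise alignment it induces on $S_i$, $S_j$.
(Note that $d(i,j) \ge D(S_i,S_j)$.)
- Let $v(M) = \sum_{i=1}^{k} \sum_{j \ne i} d(i,j)$.
- Let $M^*$ denote the optimal alignment of $S_1, \ldots, S_k$.
- Let $d^*(i,j)$ denote the value of the alignment between $S_i$ and $S_j$ induced
by $M^*$, and let $v(M^*) = \sum_{i=1}^{k} \sum_{j \ne i} d^*(i,j)$.
We note that $v(M)$ is exactly twice the SP score of $M$. We will assume w.l.o.g.
that $S_1$ is the center found by the algorithm,
so for each $l$, $d(1,l) = D(S_1,S_l)$.
We also note that
$\sum_{l=2}^{k} D(S_1,S_l) = \min_{i} \sum_{j \ne i} D(S_i,S_j)$.
Theorem 4.1: $\frac{v(M)}{v(M^*)} \le \frac{2(k-1)}{k} < 2$.
Proof: By the triangle inequality, $d(i,j) \le d(i,1) + d(1,j)$ for all $i, j$. Hence
$v(M) = \sum_{i=1}^{k} \sum_{j \ne i} d(i,j) \le \sum_{i=1}^{k} \sum_{j \ne i} [d(i,1) + d(1,j)] = 2(k-1) \sum_{l=2}^{k} d(1,l) = 2(k-1) \sum_{l=2}^{k} D(S_1,S_l)$.
On the other hand, since $d^*(i,j) \ge D(S_i,S_j)$ and $S_1$ minimizes the sum of optimal pairwise scores,
$v(M^*) = \sum_{i=1}^{k} \sum_{j \ne i} d^*(i,j) \ge \sum_{i=1}^{k} \sum_{j \ne i} D(S_i,S_j) \ge k \sum_{l=2}^{k} D(S_1,S_l)$.
Dividing the first bound by the second yields the claim.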
Theorem 4.1 implies that the center star algorithm produces a multiple alignment
with a value which is at most
$2 - \frac{2}{k}$
times the value
of the optimal alignment, where $k$ is the number of sequences. For example,
for $k = 3$ the factor is $4/3 \approx 1.33$,
and for $k = 4$ it is $1.5$.
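To see how the guarantee behaves as the number of sequences grows, the following small sketch tabulates the factor $2(k-1)/k$ for a few values of $k$. The function name `center_star_bound` is hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def center_star_bound(k):
    """Approximation factor 2(k-1)/k = 2 - 2/k for k >= 2 sequences."""
    if k < 2:
        raise ValueError("the bound is stated for at least two sequences")
    return Fraction(2 * (k - 1), k)


if __name__ == "__main__":
    for k in (2, 3, 4, 6, 10, 100):
        bound = center_star_bound(k)
        print(f"k = {k:3d}: factor = {bound} (about {float(bound):.3f})")
```

The factor is strictly below two for every $k$ and approaches two only as $k$ grows, so the guarantee is noticeably stronger for small sets of sequences.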
Peer Itsik
2000-12-06