Consensus Strings from Multiple Alignment

Definition 5.5   Given a multiple alignment ${{\cal M}}$ of a set of strings $\S$, the consensus character in column i of ${{\cal M}}$ is the character that minimizes the summed distance to it from all the characters in column i. Let d(i) denote that minimum sum in column i.

Definition 5.6   The consensus string $S_{{{\cal M}}}$ derived from the alignment ${{\cal M}}$ is the concatenation of the consensus characters for each column of ${{\cal M}}$.

Definition 5.7   The alignment error of $S_{{{\cal M}}}$ equals $\sum_{i=1}^{l}d(i)$, where l is the number of characters in $S_{{{\cal M}}}$ (that is, the number of columns of ${{\cal M}}$).
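Definitions 5.5 to 5.7 can be illustrated with a minimal Python sketch. It assumes the alignment is given as a list of equal-length rows, that a gap is written as "-", and that the character distance is the unit cost (0 for a match, 1 otherwise). The names `consensus` and `unit_distance` and the choice of scoring function are assumptions made for illustration and do not come from the notes; any other character distance can be passed in.

```python
from typing import Callable, List, Optional, Tuple


def unit_distance(a: str, b: str) -> int:
    """Assumed scoring scheme: 0 for identical characters, 1 otherwise."""
    return 0 if a == b else 1


def consensus(
    alignment: List[str],
    distance: Callable[[str, str], int] = unit_distance,
    alphabet: Optional[List[str]] = None,
) -> Tuple[str, int]:
    """Return (consensus string, alignment error) for a multiple alignment.

    Each row of `alignment` is one aligned string; all rows must have the
    same length. Gaps are ordinary characters here (written "-").
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all rows of the alignment must have equal length")
    if alphabet is None:
        # Candidate consensus characters: everything seen in the alignment.
        alphabet = sorted(set("".join(alignment)))

    chars = []
    error = 0
    for i in range(length):
        column = [row[i] for row in alignment]
        # d(i): the minimum, over candidate characters, of the summed
        # distance to all characters in column i (Definition 5.5).
        best_char, best_sum = None, None
        for c in alphabet:
            s = sum(distance(c, x) for x in column)
            if best_sum is None or s < best_sum:
                best_char, best_sum = c, s
        chars.append(best_char)   # Definition 5.6: concatenate per column
        error += best_sum         # Definition 5.7: sum of d(i)
    return "".join(chars), error


if __name__ == "__main__":
    print(consensus(["AC-T", "ACGT", "TCGA"]))
```

Ties between candidate characters are broken here by alphabet order, which is an arbitrary choice; the definitions do not specify one.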

Definition 5.8   The optimal consensus multiple alignment is a multiple alignment ${{\cal M}}$ of an input set $\S$ whose consensus string $S_{{{\cal M}}}$ has the minimum alignment error over all multiple alignments of $\S$. It can be shown that this consensus string corresponds to the optimal Steiner string, as defined in section 5.1.3.

The center string $S_c$ can be used to approximate the optimal consensus multiple alignment: the alignment error obtained from it is smaller than $(2 - \frac{2}{k})$ times the optimal alignment error.
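As a rough sketch of how a center string might be selected, the Python code below assumes the center string is the input string that minimizes the summed distance to all the other input strings, and it uses unit-cost edit distance as the string distance. Both the distance and the function names `edit_distance` and `center_string` are assumptions for illustration; the notes may use a different scoring scheme.

```python
from typing import List, Tuple


def edit_distance(s: str, t: str) -> int:
    """Unit-cost edit distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, 1):
        curr = [i]
        for j, b in enumerate(t, 1):
            curr.append(min(
                prev[j] + 1,             # delete a from s
                curr[j - 1] + 1,         # insert b into s
                prev[j - 1] + (a != b),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def center_string(strings: List[str]) -> Tuple[str, int]:
    """Return the input string with the smallest summed distance to the
    other inputs, together with that sum."""
    best, best_sum = None, None
    for idx, s in enumerate(strings):
        total = sum(
            edit_distance(s, t)
            for jdx, t in enumerate(strings)
            if jdx != idx
        )
        if best_sum is None or total < best_sum:
            best, best_sum = s, total
    return best, best_sum


if __name__ == "__main__":
    print(center_string(["ACGT", "ACGA", "TCGT"]))
```

The sketch computes all pairwise distances directly, which is enough for small inputs; it only selects the center string and does not build the multiple alignment around it.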

Itshack Pe'er
1999-03-16