Finding the Optimal Branch Lengths

We are now ready to tackle the more difficult task of finding the optimal branch lengths for a given tree topology. First, let us assume that all the lengths are known except for t_rv. If r is the root (as in figure 8.11), then we get:

$$\log L = \sum_{j=1}^{m} \log \left[ \sum_{x,y} P(x) \cdot \ldots \cdot P_{x \rightarrow y}(t_{rv}) \cdot C_j^r(y,v) \right] \qquad (14)$$

which is an elementary function of t_rv and some constants. (Notation: C_j^u(x,y) denotes C_j(x,y) computed in a tree where u is the root.)

We now need to maximize $\log{L}$ with respect to t_rv. This can be done by many standard methods, e.g., Newton-Raphson or the EM algorithm. The same process can also be applied when r is not the original root. As explained earlier, assuming reversibility, i.e., $P(x) \cdot P_{x \rightarrow y}(t) = P(y) \cdot P_{y \rightarrow x}(t)$ for any x, y, and t, the root can be set at any node without affecting L. In other words, in order to find an optimal length for the branch between nodes r and v, we simply need to hang the tree from r, so that the previous analysis holds.
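As an illustration of the one-dimensional maximization, here is a minimal Python sketch of a safeguarded Newton-Raphson iteration for a single branch length. It rests on assumptions that are not part of the text: the substitution model is taken to be Jukes-Cantor (uniform P(x) = 1/4, with t measured in expected substitutions per site), and the per-site conditional likelihoods of the two subtrees on either side of the branch are assumed to be given as the arrays `A` (the r side) and `B` (the v side). All function names are hypothetical.

```python
import math

def one_hot(base):
    """Conditional likelihood vector of a leaf that shows `base`."""
    return [1.0 if base == b else 0.0 for b in "ACGT"]

def site_coefficients(A, B):
    """Under Jukes-Cantor (an assumption), P_{x->y}(t) = (1 - e)/4 + e * [x == y]
    with e = exp(-4t/3), so each site likelihood has the form a_j + b_j * e."""
    coeffs = []
    for Aj, Bj in zip(A, B):
        S = sum(Aj) * sum(Bj)                      # sum over all pairs (x, y)
        D = sum(x * y for x, y in zip(Aj, Bj))     # sum over pairs with x == y
        coeffs.append((S / 16.0, (D - S / 4.0) / 4.0))
    return coeffs

def log_likelihood(t, coeffs):
    e = math.exp(-4.0 * t / 3.0)
    return sum(math.log(a + b * e) for a, b in coeffs)

def optimize_branch(A, B, t0=0.1, tol=1e-10, max_iter=100, t_min=1e-8):
    """Newton-Raphson on d(log L)/dt with step halving as a safeguard."""
    coeffs = site_coefficients(A, B)
    t = t0
    for _ in range(max_iter):
        e = math.exp(-4.0 * t / 3.0)
        de, d2e = -4.0 / 3.0 * e, 16.0 / 9.0 * e   # first and second derivative of e
        grad = sum(b * de / (a + b * e) for a, b in coeffs)
        hess = sum(b * d2e / (a + b * e) - (b * de / (a + b * e)) ** 2
                   for a, b in coeffs)
        # Newton step where log L is locally concave, else a fixed uphill step.
        step = -grad / hess if hess < 0 else math.copysign(0.1, grad)
        current = log_likelihood(t, coeffs)
        new_t = max(t + step, t_min)
        # Halve the step until the likelihood no longer decreases.
        while log_likelihood(new_t, coeffs) < current and abs(step) > tol:
            step /= 2.0
            new_t = max(t + step, t_min)
        if abs(new_t - t) < tol:
            return new_t
        t = new_t
    return t
```

For a tree with only two leaves, `A` and `B` are one-hot vectors built with `one_hot`, and the result can be compared with the closed-form Jukes-Cantor distance -3/4 · ln(1 - 4p/3), where p is the fraction of differing sites.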

Our next step is to find optimal branch lengths when none of them is known a priori. The main problem is that once one branch has changed length, there is no guarantee that the others are still at their optimal lengths: the branch lengths are clearly not independent of one another. In practice, however, locally improving the likelihood by optimizing the length of one branch at a time works quite well, as the interactions between branch lengths are not very strong. After a few sweeps through the tree, calculating the optimal length of each edge separately, the likelihood converges, and the result is a set of near-optimal branch lengths for the given topology.
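The sweeping procedure can be sketched as a simple coordinate ascent. The following Python example is only an illustration under assumptions that do not come from the text: it uses the Jukes-Cantor model, a three-leaf star tree (one internal node joined to every leaf), and a golden-section search in place of Newton-Raphson for the one-dimensional step. The function names and the search interval [1e-6, 5] are hypothetical choices.

```python
import math

def jc_prob(x, y, t):
    """Jukes-Cantor transition probability (assumed model)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if x == y else 0.25 - 0.25 * e

def star_log_lik(seqs, lengths):
    """log L of a star tree: sum over the state x of the internal node."""
    total = 0.0
    for column in zip(*seqs):
        site = 0.0
        for x in "ACGT":
            term = 0.25                      # uniform prior P(x)
            for s, t in zip(column, lengths):
                term *= jc_prob(x, s, t)
            site += term
        total += math.log(site)
    return total

def golden_max(f, lo, hi, tol=1e-8):
    """Golden-section search for the maximum of a unimodal f on [lo, hi]."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:                          # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:                                # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
    return (a + b) / 2.0

def sweep_optimize(seqs, lengths, eps=1e-9, max_sweeps=50):
    """Optimize one branch at a time until a full sweep gains less than eps."""
    lengths = list(lengths)
    history = [star_log_lik(seqs, lengths)]
    for _ in range(max_sweeps):
        for i in range(len(lengths)):
            def f(t, i=i):
                # Likelihood with branch i set to t, all others held fixed.
                return star_log_lik(seqs, lengths[:i] + [t] + lengths[i + 1:])
            lengths[i] = golden_max(f, 1e-6, 5.0)
        history.append(star_log_lik(seqs, lengths))
        if history[-1] - history[-2] < eps:
            break
    return lengths, history

seqs = ["GTGTACGTACGTACGTACGT",
        "ACACGCGTACGTACGTACGT",
        "ACGTATACGCGTACGTACGT"]
lengths, history = sweep_optimize(seqs, [0.5, 0.5, 0.5])
```

The list `history` records the log-likelihood after each sweep, which makes it easy to check that the value does not decrease and to see how quickly the sweeps stop improving it.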

Peer Itsik
2001-01-01