This problem is relatively easy to solve. First of all, it is clear that we can solve it for each character separately, characters being mutually independent. For a single character, we will present the following algorithm:
Fitch's algorithm [6]:
Input: A phylogenetic tree T with n nodes, and a single character c with a set A of k possible values. Denote the value of the character at node v by $v_c$.
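Step 1: Working from the leaves up to the root, we assign to each node v a set $S_v \subseteq A$, in the following fashion:
For each leaf v: $S_v = \{v_c\}$.
For each internal node v with children u and w: $S_v = S_u \cap S_w$ if this intersection is non-empty, and $S_v = S_u \cup S_w$ otherwise.
Step 2: Working from the root down to the leaves, we assign a value $v_c$ to each node:
For the root r: choose $r_c$ arbitrarily from $S_r$.
For any other node v with parent u: set $v_c = u_c$ if $u_c \in S_v$; otherwise choose $v_c$ arbitrarily from $S_v$.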
The result of this algorithm is a fully-labeled tree. The number of changes in this tree is equal to the number of times the intersection $S_u \cap S_w$ of the two child sets was empty in step 1.
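The two passes can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes a rooted binary tree stored as a dictionary mapping each internal node to its two children, and the names `fitch`, `tree`, `root` and `leaf_values` are hypothetical. Where the algorithm allows an arbitrary choice, the sketch takes the smallest value so that the output is deterministic.

```python
def fitch(tree, root, leaf_values):
    """Label a rooted binary tree for one character.

    tree        -- dict: internal node -> (child, child)   (assumed layout)
    leaf_values -- dict: leaf -> observed character value
    Returns (labels, changes).
    """
    # Pre-order listing: every node appears after its parent.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        if v in tree:
            stack.extend(tree[v])

    # Step 1 (bottom-up): compute the candidate set of every node.
    sets, changes = {}, 0
    for v in reversed(order):            # children before parents
        if v not in tree:                # leaf
            sets[v] = {leaf_values[v]}
        else:
            u, w = tree[v]
            common = sets[u] & sets[w]
            if common:
                sets[v] = common
            else:                        # empty intersection: one change
                sets[v] = sets[u] | sets[w]
                changes += 1

    # Step 2 (top-down): keep the parent's value whenever it is allowed.
    labels = {root: min(sets[root])}     # arbitrary choice made deterministic
    for v in order:                      # parents before children
        for child in tree.get(v, ()):
            if labels[v] in sets[child]:
                labels[child] = labels[v]
            else:
                labels[child] = min(sets[child])
    return labels, changes


# Four leaves under two internal nodes x and y.
tree = {"root": ("x", "y"), "x": ("l1", "l2"), "y": ("l3", "l4")}
leaf_values = {"l1": "A", "l2": "C", "l3": "A", "l4": "G"}
labels, changes = fitch(tree, "root", leaf_values)
print(changes)   # 2
print(labels)
```

In this example the intersection is empty at x and at y but not at the root, so two changes are counted, and the top-down pass labels all three internal nodes with A.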
Complexity: For each node v, computing $S_v$ takes $O(k)$ time, and computing $v_c$ takes another $O(k)$. The total is $O(nk)$ time (step 2 can be performed in only $O(n)$ total time in the average case).
The above algorithm works with a single character. To obtain the optimal score and labeling for the entire data, simply apply the algorithm once for each character. This leads to an overall complexity of $O(mnk)$, where m denotes the number of characters.
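Applying the algorithm once per character can be sketched as below. This is an illustrative sketch under stated assumptions: the tree is a rooted binary tree stored as a dictionary from internal node to its two children, each leaf carries a string with one symbol per character, and the names `fitch_score`, `total_score` and `sequences` are hypothetical. Only the bottom-up pass is run, since the score alone does not need the labeling.

```python
def fitch_score(tree, root, leaf_values):
    """Bottom-up pass only: number of changes for a single character."""
    changes = 0

    def candidate_set(v):
        nonlocal changes
        if v not in tree:                      # leaf
            return {leaf_values[v]}
        u, w = tree[v]
        su, sw = candidate_set(u), candidate_set(w)
        if su & sw:
            return su & sw
        changes += 1                           # empty intersection
        return su | sw

    candidate_set(root)
    return changes


def total_score(tree, root, sequences):
    """Sum the single-character scores over all character positions."""
    m = len(next(iter(sequences.values())))    # number of characters
    return sum(
        fitch_score(tree, root, {leaf: seq[i] for leaf, seq in sequences.items()})
        for i in range(m)
    )


tree = {"root": ("x", "y"), "x": ("l1", "l2"), "y": ("l3", "l4")}
sequences = {"l1": "AAG", "l2": "CAG", "l3": "AAT", "l4": "GAT"}
print(total_score(tree, "root", sequences))    # 3
```

The three positions contribute 2, 0 and 1 changes respectively. The recursive helper is kept for brevity; for very deep trees an iterative traversal would avoid Python's recursion limit.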
It is not very clear at first sight why this algorithm works. We will next present a generalization of Fitch's algorithm that is perhaps easier to understand.