
   
Similarity and Difference

The resemblance of two DNA sequences taken from different organisms can be explained by the theory that all contemporary genetic material derives from one common ancestral DNA. According to this theory, mutations occurred during the course of evolution, creating differences between families of contemporary species. Most of these changes are due to local mutations, each modifying the DNA sequence in a specific manner. These local modifications between nucleotide sequences, or more generally between strings over an arbitrary alphabet, can be any of the following:

- Insertion: one or more characters are inserted into the sequence.
- Deletion: one or more characters are deleted from the sequence.
- Substitution: a character of the sequence is replaced by another.

Insertion and deletion are the reverse of one another: given two sequences, if inserting one or more characters into the first yields the second, then deleting those characters from the second yields the first. Because of this reciprocity, the two operations are usually referred to jointly as an indel.

The notion of distance derives from the concept of mutation by assigning a weight to each mutation: given two sequences, the distance between them is the minimal sum of weights over all sets of mutations that transform one into the other.

The notion of similarity derives from the concept of a common ancestral DNA by assigning weights that correspond to resemblance: given two sequences, the similarity between them is the maximal sum of such weights.
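The two definitions can be made concrete with a small dynamic-programming sketch. This is only an illustration under stated assumptions, not the method these notes go on to develop: the function names and all weight values (indel, subst, match, mismatch) are hypothetical choices, and each inserted or deleted character is weighted separately, so a multi-character indel simply costs the sum of its single-character parts.

```python
def distance(s, t, indel=1, subst=1):
    """Minimal total weight of single-character insertions, deletions and
    substitutions that transform s into t (weights are assumed values)."""
    # prev[j] = distance between the current prefix of s and t[:j];
    # initially the prefix of s is empty, so only insertions are needed.
    prev = [j * indel for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * indel]  # s[:i] versus the empty string: i deletions
        for j in range(1, len(t) + 1):
            change = 0 if s[i - 1] == t[j - 1] else subst
            curr.append(min(prev[j] + indel,         # delete s[i-1]
                            curr[j - 1] + indel,     # insert t[j-1]
                            prev[j - 1] + change))   # keep or substitute
        prev = curr
    return prev[-1]


def similarity(s, t, match=1, mismatch=-1, indel=-2):
    """Maximal total weight when matching up s and t, rewarding equal
    characters and penalizing mismatches and indels (assumed weights)."""
    prev = [j * indel for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * indel]
        for j in range(1, len(t) + 1):
            pair = match if s[i - 1] == t[j - 1] else mismatch
            curr.append(max(prev[j] + indel,
                            curr[j - 1] + indel,
                            prev[j - 1] + pair))
        prev = curr
    return prev[-1]


print(distance("ACGT", "AGT"))    # 1: delete the C
print(similarity("ACGT", "AGT"))  # 1: three matches (+3) and one indel (-2)
```

Distance is minimized and similarity is maximized, but both functions fill the same table; only the weights and the choice of min or max differ.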

 
Peer Itsik
2000-11-20