Next: Models for Inexact Matching Up: Pairwise Alignment Previous: Motivation

   
Similarity and Difference

The resemblance between two DNA sequences taken from different organisms can be explained by the theory that all contemporary genetic material descends from a single ancient ancestral DNA. According to this theory, mutations occurred during the course of evolution, creating differences between families of contemporary species. Most of these changes are due to local mutations, each modifying the DNA sequence in a specific manner. These local modifications between nucleotide sequences, or more generally between strings over an arbitrary alphabet, can be one of the following: insertion of one or more characters into the sequence, deletion of one or more characters from the sequence, or substitution of one character by another.

Insertion and deletion are the reverse of one another: given two sequences, if inserting one or more characters into the first yields the second, then deleting those same characters from the second transforms it back into the first. Because of this reciprocity, insertions and deletions are usually referred to jointly as indels.
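The reciprocity can be illustrated with a minimal sketch. The helper names `insert` and `delete` and the example strings are hypothetical, chosen only for illustration.

```python
def insert(s, pos, chars):
    """Return s with chars inserted before index pos."""
    return s[:pos] + chars + s[pos:]


def delete(s, pos, length):
    """Return s with length characters removed, starting at index pos."""
    return s[:pos] + s[pos + length:]


first = "ACGT"
second = insert(first, 2, "TT")  # insertion turns the first string into the second

# Deleting the same characters from the second string recovers the first.
recovered = delete(second, 2, 2)
```

Seen from the first string the event is an insertion; seen from the second it is a deletion, which is why a single term covers both.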

The notion of distance derives its definition from the concept of mutations: a weight is assigned to each mutation. Given two strings, the distance between them is the minimal sum of weights over all sets of mutations that transform one into the other.
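As a rough illustration of this definition, the following sketch computes such a distance by dynamic programming. It rests on simplifying assumptions that the text does not fix: every mutation affects a single character, every indel has the same non-negative weight `indel`, and every substitution has the same non-negative weight `substitution`. The function name and the default weights are hypothetical.

```python
def weighted_distance(a, b, indel=1, substitution=1):
    """Minimal total weight of single-character mutations turning a into b."""
    # prev[j] is the distance between the current prefix of a and b[:j].
    prev = [j * indel for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * indel]  # turning a[:i] into the empty string takes i deletions
        for j in range(1, len(b) + 1):
            same = a[i - 1] == b[j - 1]
            curr.append(min(
                prev[j - 1] + (0 if same else substitution),  # keep or substitute
                prev[j] + indel,                              # delete a[i-1]
                curr[j - 1] + indel,                          # insert b[j-1]
            ))
        prev = curr
    return prev[-1]
```

With the default unit weights this reduces to the familiar edit distance. Raising `substitution` above twice the `indel` weight makes a deletion followed by an insertion cheaper than a substitution, which shows how the choice of weights shapes the resulting distance.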

The notion of similarity derives its definition from the concept of a single ancient ancestral DNA: weights are assigned that correspond to resemblance. Given two strings, the similarity between them is the maximal sum of such weights.
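One possible reading of this definition can be sketched as follows, assuming the maximum is taken over all ways of lining up the two strings character by character, with gaps allowed. The function name and the particular weights (`match`, `mismatch`, `indel`) are hypothetical and are not taken from the text.

```python
def similarity(a, b, match=1, mismatch=-1, indel=-2):
    """Maximal total weight over all gapped pairings of a and b."""
    # prev[j] is the best score for the current prefix of a against b[:j].
    prev = [j * indel for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * indel]  # a[:i] paired entirely with gaps
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + pair,   # pair a[i-1] with b[j-1]
                prev[j] + indel,      # pair a[i-1] with a gap
                curr[j - 1] + indel,  # pair b[j-1] with a gap
            ))
        prev = curr
    return prev[-1]
```

Unlike a distance, this score rewards resemblance: identical strings obtain the highest score available for their length, and mismatches or gaps lower it.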



 
Itshack Pe'er
1999-01-03