Next: Similarity and Difference Up: Problem Definition and Biological Previous: Problem Definition and Biological

Motivation

A large variety of biologically motivated problems in computer science primarily involve sequences, or strings. Many of these research problems aim to learn about the function or structure of a protein without performing any experiments, and indeed without having to physically construct the protein itself. The basic idea is that similar sequences produce similar proteins. Thus, to predict the characteristics of a protein from its sequence data alone, we can use the structure and function information available in databases for known proteins with similar sequences. For instance, when considering protein folding, it usually suffices that two protein sequences are identical at 25% of their positions for their three-dimensional structures to be almost identical. A classical example is the establishment of an association between cancer and uncontrolled cell growth [1]. This discovery was made by comparing the sequence of a cancer-associated gene against the sequences of proteins already known to influence cell growth. The similarity between the sequences was very high, which established the connection between cancer and cellular growth.
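To make the "identical at 25% of their positions" criterion concrete, here is a minimal sketch of how percent identity could be computed. It assumes the two sequences have already been aligned to equal length; the function name `percent_identity` and the toy sequences are invented for illustration and do not come from the notes or from any database.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of positions at which two sequences agree.

    Assumption: the sequences are already aligned and have equal length.
    Real comparisons first need an alignment step, which is not shown here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    # Count the positions where both sequences carry the same residue.
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Toy amino-acid strings, invented for illustration only.
first = "MKTAYIAK"
second = "MKSGWLAR"
identity = percent_identity(first, second)
print(f"{identity:.1f}% identical")
```

Under the rule of thumb quoted above, a pair scoring at or above roughly 25% would be a candidate for having similar three-dimensional structures; the threshold is the notes' heuristic, not something this sketch verifies.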
Peer Itsik
2000-11-20