The Gene Finding Problem

Problem 2   Given a DNA sequence, predict the location of genes (open reading frames), exons and introns.

A simple approach is to scan the sequence for stop codons. If several stop codons lie close to each other in the same reading frame, that section cannot be a gene, since translation would have terminated there. Conversely, the longer a stretch that contains no stop codons, the more probable it is that the stretch contains a gene. The problem is more complex in eukaryotic DNA, where exons are interleaved with introns. There, a stop codon does not show that the sequence lies outside a gene, but only that it lies outside an exon. A further complication is that any DNA sequence can be read in 6 different ways: 3 possible offsets for the first codon (the reading frame), in each of the 2 reading directions. It is safe to assume that in most cases, apart from prokaryotic species, a DNA section encodes only one gene.

Itshack Pe`er
1998-12-27