The Gene Finding Problem
Problem 2
Given a DNA sequence, predict the location of genes (open reading frames), exons and introns.
A simple approach is to scan the sequence for stop codons.
Clearly, if several stop codons exist close to each other in a
section, that section cannot be a gene, since translation would
have terminated there. Conversely, when a relatively long sequence
contains no stop codons, it becomes more probable that it contains
a gene: in a random sequence with uniform base composition, 3 of
the 64 possible codons are stop codons, so a stop is expected
roughly every 21 codons. The problem becomes more complex in
eukaryotic DNA due to the existence of interleaved exons and
introns. In that case, a stop codon does not indicate that the
sequence is outside a gene, but merely that it is not in an exon.
Further complications arise from the fact that a given DNA
sequence can be read in 6 different ways: 3 possible offsets for
the starting point (the reading frame of the codons) in each of
the 2 reading directions. It is safe to assume that in most cases,
apart from prokaryotic species, a DNA section will encode only one
gene.
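
The stop-codon scan described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a gene finder: it assumes the standard stop codons TAA, TAG and TGA, an input made up of the letters A, C, G and T, and it ignores start codons and introns altogether. The names stop_free_stretches and min_codons are hypothetical, chosen for this example.

```python
# Assumption: standard genetic code, DNA alphabet A/C/G/T only.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, read in its own 5'-to-3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


def stop_free_stretches(seq, min_codons):
    """Yield (strand, offset, start, end) for every stop-free run of at
    least min_codons codons, over all 6 readings of seq.

    start and end are indices into the strand being read: seq itself
    for '+', its reverse complement for '-'. end is exclusive.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            run_start = offset
            # Walk the strand one whole codon at a time in this frame.
            for i in range(offset, len(s) - 2, 3):
                if s[i:i + 3] in STOP_CODONS:
                    if (i - run_start) // 3 >= min_codons:
                        yield strand, offset, run_start, i
                    run_start = i + 3
            # Run that reaches the end of the strand without a stop.
            end = offset + ((len(s) - offset) // 3) * 3
            if (end - run_start) // 3 >= min_codons:
                yield strand, offset, run_start, end


if __name__ == "__main__":
    for hit in stop_free_stretches("ATGAAACCCGGGTAAATG", min_codons=4):
        print(hit)
```

On real data min_codons would be set well above the roughly 21 codons expected between stops by chance. A long stretch reported here is only a candidate: in eukaryotic DNA it may correspond to a single exon rather than a whole gene.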
Itshack Pe`er
1998-12-27