The figure shows a consensus sequence for a typical eukaryotic gene.
At the exon-intron junctions, the observed sequences are highly
similar to the consensus (i.e., the base frequencies there are close
to 100%).
[Figure: consensus sequence at splice junctions.]
The figure also shows an anchor point within the intron, called the
branch point, that appears frequently. Another statistical
characteristic is a pyrimidine-rich (bases C, T) region that appears
between the branch point and the acceptor site. Such position-dependent
signals naturally lead to algorithms based on position-specific weight
matrices. This idea, however, does not exploit all the available
information (reading frames, intron/exon states, etc.) and is not
suitable for short genes. Consequently, we will look for integrated
approaches.
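
To make the weight-matrix idea concrete, here is a minimal sketch of how a position-specific weight matrix could be built from aligned signal sites and used to scan a sequence. The training sites, the uniform background of 0.25 per base, the pseudocount of 1, and the function names `build_pwm`, `score_window` and `best_site` are all illustrative assumptions and are not taken from these notes.

```python
import math

BASES = "ACGT"

def build_pwm(aligned_sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix from equal-length aligned sites.

    Assumes a uniform background frequency; the pseudocount avoids log(0).
    """
    width = len(aligned_sites[0])
    pwm = []
    for pos in range(width):
        counts = {base: pseudocount for base in BASES}
        for site in aligned_sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log-odds of each base at this position versus the background.
        pwm.append({base: math.log2((counts[base] / total) / background)
                    for base in BASES})
    return pwm

def score_window(pwm, window):
    """Sum the per-position log-odds scores of a window as wide as the matrix."""
    return sum(column[base] for column, base in zip(pwm, window))

def best_site(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    return max((score_window(pwm, sequence[i:i + width]), i)
               for i in range(len(sequence) - width + 1))

# Hypothetical aligned sites around a donor-like signal (toy data only).
training_sites = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT", "AGGTAAGA"]
pwm = build_pwm(training_sites)

# Toy sequence containing the consensus "AGGTAAGT" starting at index 5.
sequence = "CCCTTAGGTAAGTCCC"
score, start = best_site(pwm, sequence)
print(f"best window starts at {start} with log-odds score {score:.2f}")
```

Because each position is scored independently, such a matrix cannot represent reading frames or intron/exon state, which is the limitation noted above.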
Peer Itsik