The main idea of the Structural Genomics Project is to determine all protein structures using the homology modeling approach.
The number of protein sequences known so far is much higher
than the number of proteins with known 3D structure.
We also know that the number of possible protein folds is relatively small
(there are about 600-700 different folds among the 10,000 PDB structures).
Proteins within the same family usually have the same fold.
Under these assumptions the Structural Genomics Project works as follows.
The space of protein sequences is divided into clusters according to sequence
similarity. If there is a protein with known 3D structure in the cluster,
then the other structures are determined using homology modeling.
If no structure in the cluster is known, the most "appropriate" protein for
crystallization is selected from the cluster and its structure is determined
by X-ray crystallography.
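
The workflow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual pipeline: the similarity measure (difflib's matching ratio instead of a real sequence alignment score), the greedy clustering, the 0.7 threshold, and the "shortest sequence" rule for picking the crystallization target are all hypothetical stand-ins, as are the function names.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    # Assumption: a crude stand-in for a real alignment-based similarity score.
    return SequenceMatcher(None, a, b).ratio()


def cluster_sequences(seqs, threshold=0.7):
    """Greedy clustering: each sequence joins the first cluster whose
    representative (first member) is at least `threshold` similar."""
    clusters = []
    for name, seq in seqs.items():
        for cluster in clusters:
            if similarity(seq, seqs[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def plan_structure_determination(seqs, solved, threshold=0.7):
    """Map each protein without a known structure to (method, template)."""
    plan = {}
    for cluster in cluster_sequences(seqs, threshold):
        templates = [n for n in cluster if n in solved]
        if templates:
            # A known 3D structure exists in the cluster: model the rest on it.
            for n in cluster:
                if n not in solved:
                    plan[n] = ("homology_modeling", templates[0])
        else:
            # No known structure: pick one target for crystallography.
            # Assumption: "most appropriate" is taken to mean shortest sequence.
            target = min(cluster, key=lambda n: len(seqs[n]))
            plan[target] = ("xray_crystallography", None)
            for n in cluster:
                if n != target:
                    plan[n] = ("homology_modeling", target)
    return plan


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    seqs = {"A": "MKTAYIAKQR", "B": "MKTAYIAKQK",
            "C": "GGGGSSSSPP", "D": "GGGGSSSSP"}
    for name, step in plan_structure_determination(seqs, solved={"A"}).items():
        print(name, step)
```

In practice the criterion for the "appropriate" crystallization target would depend on experimental considerations that the text does not specify, so the selection rule here should be read as a placeholder.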
Peer Itsik
2001-03-04