Next: Main Algorithms for Database
Up: Biological Databases and Retrieval
Previous: DNA vs. Protein Searches
Specificity and Sensitivity of the Search Tools
- Sensitivity: the ability to detect "true positive" matches. The most sensitive search finds all true matches, but may also return many false positives.
- Specificity: the ability to reject "false positive" matches. The most specific search returns only true matches, but may miss many of them (false negatives); see Figure 4.1.
When choosing which algorithm to use, there is a tradeoff between these two properties. It is quite trivial to create an algorithm that optimizes one of them; the problem is to create an algorithm that achieves both.
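The tradeoff can be made concrete with a small sketch. It assumes the usual definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP), and that a match is reported when its score reaches a cutoff. The function name, the cutoff rule, and the scores are hypothetical, made up for illustration only.

```python
def sensitivity_specificity(true_scores, false_scores, cutoff):
    """Return (sensitivity, specificity) when scores >= cutoff are reported.

    true_scores:  scores of sequences that are genuinely related to the query
    false_scores: scores of unrelated (random) sequences
    """
    tp = sum(s >= cutoff for s in true_scores)   # true matches reported
    fn = len(true_scores) - tp                   # true matches missed
    fp = sum(s >= cutoff for s in false_scores)  # unrelated sequences reported
    tn = len(false_scores) - fp                  # unrelated sequences rejected
    return tp / (tp + fn), tn / (tn + fp)


# Made-up scores, for illustration only.
true_scores = [55, 62, 70, 48, 90]
false_scores = [20, 35, 50, 41, 30]

# Reporting everything maximizes sensitivity and destroys specificity.
print(sensitivity_specificity(true_scores, false_scores, float("-inf")))  # (1.0, 0.0)

# Reporting nothing maximizes specificity and destroys sensitivity.
print(sensitivity_specificity(true_scores, false_scores, float("inf")))   # (0.0, 1.0)

# An intermediate cutoff trades one against the other.
print(sensitivity_specificity(true_scores, false_scores, 45))             # (1.0, 0.8)
```

The two extreme cutoffs show why optimizing a single property is trivial: each one is perfect on one measure and useless on the other.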
Figure 4.1:
The tradeoff between specificity and sensitivity, as seen from the distributions of alignment scores of true (black) and false (white) sequences. The cases shown:
1) Substantial overlap - Too many true positives are hidden by the background. All cutoffs are bad. A better model is required.
2) Small overlap - A few true positives score lower than the highest random matches. An inclusive cutoff and visual inspection usually suffice.
3) Complete separation - All true positives are above the background. A simple cutoff suffices.
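The three cases in the caption can be told apart, roughly, by counting how many true matches score no higher than the best random match. The sketch below does this under stated assumptions: the function names are hypothetical, and the `few` threshold that separates "small" from "substantial" overlap is an arbitrary choice, not something the figure specifies.

```python
def overlap_fraction(true_scores, false_scores):
    """Fraction of true matches scoring no higher than the best random match."""
    highest_false = max(false_scores)
    hidden = sum(s <= highest_false for s in true_scores)
    return hidden / len(true_scores)


def describe_separation(true_scores, false_scores, few=0.1):
    """Label the score distributions with one of the three cases.

    `few` is an assumed threshold: at most this fraction of hidden
    true matches still counts as a "small overlap".
    """
    frac = overlap_fraction(true_scores, false_scores)
    if frac == 0:
        return "complete separation"
    if frac <= few:
        return "small overlap"
    return "substantial overlap"


# Made-up scores, for illustration only.
print(describe_separation([60, 70, 80], [10, 20, 30]))      # complete separation
print(describe_separation([25] + [60] * 19, [10, 20, 30]))  # small overlap
print(describe_separation([30, 40, 60], [50]))              # substantial overlap
```

In the complete-separation case any cutoff between the highest false score and the lowest true score gives both perfect sensitivity and perfect specificity, which is why a simple cutoff suffices there.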
Itshack Pe`er
1999-01-17