Introduction
In any living cell undergoing any biological process, different subsets of its
genes are expressed in different stages of the process. The particular genes
expressed at a given stage and their relative abundance are crucial to the
cell's proper function. Measuring gene expression levels in different stages,
different body tissues, and different organisms is instrumental in
understanding biological processes. Such information can help characterize
gene/function relationships, determine the effects of experimental treatments
for diseases, and illuminate many other molecular biological processes.
One approach to measuring gene expression profiles uses hybridization-based
arrays. In this approach, a set of oligonucleotides (oligos) is immobilized on a
surface to form the hybridization array. When a labeled target DNA mixture,
which was sampled in a specific condition (stage, tissue, organism, etc.), is
introduced to the array, target sequences hybridize to complementary
immobilized molecules. The resulting hybridization intensity (detected, for
example, by fluorescence) indicates the mixture's content and the relative
expression levels of the genes in the tested condition. Different
conditions are tested, and eventually every gene has its own profile,
i.e., a vector of expression intensities corresponding to the different
conditions.
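As a small illustration of what such a profile looks like as data, the following Python sketch turns per-condition readings into one vector per gene. The gene names, condition names, and intensity values are all hypothetical, and the sketch assumes every gene was measured in every condition.

```python
# Hypothetical hybridization intensities: condition -> {gene -> intensity}.
# All names and numbers are made up for illustration.
readings = {
    "stage_1": {"geneA": 0.8, "geneB": 2.4},
    "stage_2": {"geneA": 1.6, "geneB": 2.2},
    "stage_3": {"geneA": 3.1, "geneB": 0.5},
}


def build_profiles(readings):
    """Return the condition order and one expression vector per gene.

    Assumes each gene has a reading in every condition.
    """
    conditions = list(readings)  # fixes the order of the vector components
    genes = sorted({g for per_condition in readings.values() for g in per_condition})
    profiles = {g: [readings[c][g] for c in conditions] for g in genes}
    return conditions, profiles


conditions, profiles = build_profiles(readings)
print(conditions)
print(profiles["geneA"])
```

Component i of each vector corresponds to the i-th tested condition, so profiles of different genes can be compared position by position.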
Clustering techniques are used to identify subsets of genes that behave
similarly under the set of tested conditions. Analyzing multi-conditional gene
expression patterns with clustering algorithms involves the following steps:
1. Measuring gene expression levels, reported as a vector of real numbers.
2. Computing a similarity matrix for the genes (e.g., correlations).
3. Clustering the genes based on their similarity to each other.
4. Visual representation of the clusters.
5. Analysis of the results.
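Steps 1 to 3 above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: similarity is taken to be the Pearson correlation, and the clustering rule (connected components of the graph that links genes whose similarity reaches a threshold) is a deliberately simple stand-in chosen for brevity. It is not the CAST algorithm, and the profiles, the function names, and the threshold of 0.9 are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant profile: correlation is undefined, treated as 0 here
    return cov / sqrt(vx * vy)


def similarity_matrix(profiles):
    """Step 2: pairwise correlations, with genes in sorted name order."""
    genes = sorted(profiles)
    sim = [[pearson(profiles[g], profiles[h]) for h in genes] for g in genes]
    return genes, sim


def threshold_clusters(genes, sim, threshold=0.9):
    """Step 3 (simple stand-in, not CAST): connected components of the
    graph that has an edge wherever sim[i][j] >= threshold."""
    unseen = set(range(len(genes)))
    clusters = []
    while unseen:
        start = min(unseen)
        unseen.remove(start)
        stack, component = [start], [start]
        while stack:
            i = stack.pop()
            neighbours = [j for j in unseen if sim[i][j] >= threshold]
            for j in neighbours:
                unseen.remove(j)
                component.append(j)
                stack.append(j)
        clusters.append(sorted(genes[i] for i in component))
    return clusters


# Step 1: hypothetical profiles, one vector of real numbers per gene.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 4.2, 1.9],
}
genes, sim = similarity_matrix(profiles)
clusters = threshold_clusters(genes, sim, threshold=0.9)
for cluster in clusters:
    print(cluster)
```

With these made-up profiles, the two rising genes fall into one cluster and the two falling genes into another. Connected components can chain loosely related genes together through intermediates, which is one reason to prefer a purpose-built clustering algorithm on real data.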
The clustering algorithm tried here is CAST (Cluster Affinity Search
Technique) [18], which takes a graph-theoretic approach and uses a
stochastic model of the input.
Itshack Pe`er
1999-03-16