Next: Temporal Gene Expression Patterns Up: Analyzing Gene Expression Data Previous: Analyzing Gene Expression Data

   
Introduction

In any living cell undergoing a biological process, different subsets of its genes are expressed at different stages of the process. The particular genes expressed at a given stage, and their relative abundance, are crucial to the cell's proper function. Measuring gene expression levels in different stages, different body tissues, and different organisms is instrumental in understanding biological processes. Such information can help characterize gene/function relationships, determine the effects of experimental treatments for diseases, and understand many other molecular biological processes.

One approach to measuring gene expression profiles uses hybridization-based arrays. In this approach, a set of oligos is immobilized on a surface to form the hybridization array. When a labeled target DNA mixture, sampled under a specific condition (stage, tissue, organism, etc.), is introduced to the array, target sequences hybridize to complementary immobilized molecules. The resulting hybridization intensity (detected, for example, by fluorescence) is indicative of the mixture's content and of the relative gene expression levels in the tested condition. Different conditions are tested, and eventually every gene has its own profile, i.e., a vector of expression intensities with one entry per condition.

Clustering techniques are used to identify subsets of genes that behave similarly under the set of tested conditions. Analyzing multi-conditional gene expression patterns with clustering algorithms involves the following steps:
1. Measuring gene expression levels, reported as a vector of real numbers.
2. Computing a similarity matrix for the genes (e.g., correlations).
3. Clustering the genes based on their similarity to each other.
4. Visual representation of the clusters.
5. Analysis of the results.
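Steps 1 and 2 can be illustrated with a minimal sketch. It assumes that each gene's profile is a plain list of real numbers and that similarity is the Pearson correlation, which is one possible reading of "correlations" above. The gene names, the data values, and the function names `pearson` and `similarity_matrix` are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat profile has no defined correlation; this sketch returns 0.
        return 0.0
    return cov / sqrt(var_x * var_y)


def similarity_matrix(profiles):
    """Return the gene names and the matrix of pairwise correlations."""
    names = list(profiles)
    matrix = [[pearson(profiles[g], profiles[h]) for h in names] for g in names]
    return names, matrix


# Hypothetical profiles: one intensity per tested condition (step 1).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
}

# Step 2: the similarity matrix that a clustering algorithm would consume.
names, sim = similarity_matrix(profiles)
for name, row in zip(names, sim):
    print(name, [round(v, 2) for v in row])
```

Correlation compares the shape of two profiles rather than their magnitude, so in this toy data geneA and geneB are maximally similar even though their intensities differ by a factor of two.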
A specific clustering algorithm, CAST (Cluster Affinity Search Technique) [18], which takes a graph-theoretic approach and uses a stochastic model of the input, was applied.
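This section does not describe how CAST works, so the following is only a rough sketch of the affinity-based add/remove idea commonly associated with the algorithm, not the implementation used here or the exact procedure of [18]. It assumes a symmetric similarity matrix and a single affinity threshold `t`; the function names, the `max_steps` safety cap, the rule for picking a seed, and the example matrix are all assumptions made for illustration. The stochastic input model mentioned above is not represented.

```python
def affinity(sim, x, cluster):
    """Total similarity of element x to the other members of the cluster."""
    return sum(sim[x][u] for u in cluster if u != x)


def cast_sketch(sim, t, max_steps=1000):
    """Simplified CAST-style clustering (assumed variant, see lead-in).

    Clusters are built one at a time. An outside element joins the open
    cluster if its average similarity to the members is at least t; a
    member is dropped if its average similarity to the rest falls below t.
    """
    unassigned = set(range(len(sim)))
    clusters = []
    while unassigned:
        seed = min(unassigned)  # assumption: lowest index seeds a cluster
        unassigned.remove(seed)
        cluster = {seed}
        for _ in range(max_steps):  # cap guards against add/remove cycling
            changed = False
            if unassigned:
                best = max(sorted(unassigned),
                           key=lambda x: affinity(sim, x, cluster))
                if affinity(sim, best, cluster) >= t * len(cluster):
                    cluster.add(best)
                    unassigned.remove(best)
                    changed = True
            if len(cluster) > 1:
                worst = min(sorted(cluster),
                            key=lambda x: affinity(sim, x, cluster))
                if affinity(sim, worst, cluster) < t * (len(cluster) - 1):
                    cluster.remove(worst)
                    unassigned.add(worst)
                    changed = True
            if not changed:
                break
        clusters.append(sorted(cluster))  # close the cluster
    return clusters


# Hypothetical similarity matrix for five genes (indices 0..4).
sim = [
    [1.00, 0.90, 0.80, 0.10, 0.20],
    [0.90, 1.00, 0.85, 0.15, 0.10],
    [0.80, 0.85, 1.00, 0.20, 0.10],
    [0.10, 0.15, 0.20, 1.00, 0.90],
    [0.20, 0.10, 0.10, 0.90, 1.00],
]

print(cast_sketch(sim, t=0.5))  # [[0, 1, 2], [3, 4]]
```

In this sketch the threshold controls granularity: raising `t` above every off-diagonal similarity (for example `t=0.95` with the matrix above) leaves each gene in its own cluster.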
Itshack Pe'er
1999-03-16