
  
The Statistical Model

We now present the work of Mayraz and Shamir [5] on physical mapping from noisy data. The statistical model underlying their mapping method assumes the following:
1. Clones are uniformly and independently distributed along the target genome.
2. Clones are of equal length.
3. Probe occurrences along the genome are modeled by a Poisson process.
4. The Poisson rate is identical for all probes.
5. The noise statistically behaves as follows:

The hybridization scenario is shown in figure 9.8. The clones are the horizontal lines, and the random occurrences of a single non-unique probe are marked by the dotted vertical lines. We denote by $A$ the probe-clone occurrence matrix: $A_{i,j} = k$ if probe $j$ occurs $k$ times in clone $i$. The probe in this example occurs 3 times in this genomic region, which is covered by 7 clones, so its column in the occurrence matrix would be (1, 1, 0, 0, 1, 2, 0). Let $\lambda$ denote the Poisson rate of probe occurrences and $l$ the clone length. The probability of probe $j$ occurring $k$ times in clone $i$ is given by:

\begin{displaymath}
Pr(A_{i,j} = k) = \frac{(\lambda l)^{k} e^{-\lambda l}}{k!} \qquad (7)
\end{displaymath}
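As a rough illustration of assumptions 1-4 and equation (7), the following Python sketch evaluates the Poisson probability and simulates one probe's column of $A$. The function names, the parameter values, and the choice to draw clone start points uniformly from the interval where a full-length clone fits are all assumptions made for this example; they are not taken from Mayraz and Shamir.

```python
import math
import random


def occurrence_pmf(k, lam, clone_len):
    """Pr(A_ij = k) from equation (7): Poisson with mean lam * clone_len."""
    mean = lam * clone_len
    return mean ** k * math.exp(-mean) / math.factorial(k)


def simulate_occurrence_column(num_clones, genome_len, clone_len, lam, rng):
    """Simulate one probe's column of A under assumptions 1-4 (no noise)."""
    # Poisson process on [0, genome_len): gaps between occurrences are
    # exponential with rate lam.
    positions = []
    x = rng.expovariate(lam)
    while x < genome_len:
        positions.append(x)
        x += rng.expovariate(lam)
    # Equal-length clones with independent, uniform start points
    # (illustrative choice: a clone must fit entirely inside the genome).
    starts = [rng.uniform(0, genome_len - clone_len) for _ in range(num_clones)]
    # A_ij = number of probe occurrences falling inside clone i.
    return [sum(s <= p < s + clone_len for p in positions) for s in starts]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    # Hypothetical parameter values, chosen only for the demonstration.
    column = simulate_occurrence_column(
        num_clones=7, genome_len=100.0, clone_len=10.0, lam=0.05, rng=rng
    )
    print("simulated column of A:", column)
    print("Pr(A = 0) =", occurrence_pmf(0, 0.05, 10.0))
```

Because overlapping clones share probe occurrences, the entries of a column are correlated even though each entry on its own follows equation (7).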

We denote by $B$ the probe-clone hybridization matrix: $B_{i,j} = 1$ if probe $j$ hybridized with clone $i$, and $B_{i,j} = 0$ otherwise. The vector $\overrightarrow{B_{i}}$ of the hybridizations of clone $i$ with all the probes is also called its hybridization fingerprint. When no noise is present, hybridization occurs iff the probe occurs at least once in the clone. In that case the column of $B$ for the example above would be (1, 1, 0, 0, 1, 1, 0). Experimental noise can result in both false positive hybridizations ($B_{i,j} = 1$ when $A_{i,j} = 0$) and false negative hybridizations ($B_{i,j} = 0$ when $A_{i,j} > 0$).
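The relation between an occurrence column and its hybridization column can be sketched in a few lines of Python. The helper names `noiseless_hybridization` and `classify_errors`, and the noisy observation used in the usage lines, are hypothetical and serve only to illustrate the definitions of false positives and false negatives given above.

```python
def noiseless_hybridization(occurrence_column):
    """Noise-free column of B: 1 iff the probe occurs at least once."""
    return [1 if k > 0 else 0 for k in occurrence_column]


def classify_errors(occurrence_column, observed_column):
    """Return (false_positive_clones, false_negative_clones) as index lists."""
    false_pos = []
    false_neg = []
    for i, (k, b) in enumerate(zip(occurrence_column, observed_column)):
        if b == 1 and k == 0:
            false_pos.append(i)  # hybridization without any occurrence
        elif b == 0 and k > 0:
            false_neg.append(i)  # occurrence that failed to hybridize
    return false_pos, false_neg


a_col = [1, 1, 0, 0, 1, 2, 0]          # occurrence column from the text
print(noiseless_hybridization(a_col))  # [1, 1, 0, 0, 1, 1, 0]

# A made-up noisy observation, for illustration only.
b_obs = [1, 0, 0, 1, 1, 1, 0]
print(classify_errors(a_col, b_obs))   # ([3], [1])
```

In practice $A$ is unobserved, so this classification is possible only in simulation; the mapping algorithm sees $B$ alone.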



Hybridization fingerprints of intersecting clones are correlated. This fact is used to estimate the overlap between pairs of clones. Although noise reduces the correlation between fingerprints of overlapping clones, Bayesian inference can still be used to identify overlap, provided a sufficient number of probes is used. It may also be the case that "soft decision" hybridization signals are available. Such signals provide more information on probe occurrences than binary signals do. However, the continuous signal value does not directly correspond to the hybridization probability, so we assume that a threshold is used to transform the hybridization signal into a binary one. We therefore define the hybridization matrix $B$ to be a binary matrix, such that $B_{i,j} = 1$ if probe $j$ has produced a positive hybridization signal with clone $i$. The matrix $B$ is the actual experimental data, which is the input for the construction algorithm. It contains noise and carries no information on multiplicities. Using the statistical model we can write the following equation:

\begin{eqnarray*}Pr(B_{i,j} =1 \vert A_{i,j}) &= & Pr(\mbox{false positive}) + \...
...ha^{A_{i,j}})\\
&= & 1-e^{- \beta l \alpha} \alpha^{A_{i,j}}
\end{eqnarray*}
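The preprocessing described before the equation, thresholding soft signals into the binary matrix $B$ and comparing clone fingerprints, can be sketched as follows. The threshold value, the signal matrix, and the agreement score are assumptions made for illustration; in particular, the agreement score is a crude stand-in for fingerprint correlation and is not the Bayesian overlap estimate of Mayraz and Shamir.

```python
def threshold_signals(signals, threshold):
    """Turn a clone-by-probe matrix of soft signals into the binary matrix B."""
    return [[1 if s >= threshold else 0 for s in row] for row in signals]


def fingerprint_agreement(fp_a, fp_b):
    """Fraction of probes on which two hybridization fingerprints agree."""
    if len(fp_a) != len(fp_b) or not fp_a:
        raise ValueError("fingerprints must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(fp_a, fp_b))
    return matches / len(fp_a)


# Hypothetical soft signals for 3 clones (rows) and 4 probes (columns).
signals = [
    [0.9, 0.1, 0.7, 0.2],
    [0.8, 0.3, 0.6, 0.1],
    [0.1, 0.9, 0.2, 0.8],
]
B = threshold_signals(signals, threshold=0.5)
print(B)                                  # [[1, 0, 1, 0], [1, 0, 1, 0], [0, 1, 0, 1]]
print(fingerprint_agreement(B[0], B[1]))  # 1.0
print(fingerprint_agreement(B[0], B[2]))  # 0.0
```

High agreement between two rows of $B$ is only suggestive of overlap: with non-unique probes and noisy hybridizations, non-overlapping clones can also share hits, which is why the model-based inference is needed.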



Peer Itsik
2001-01-09