The Basic CLICK Algorithm



The CLICK algorithm represents the input data as a weighted similarity graph G = (V,E). In this graph vertices correspond to elements and edge weights are derived from the similarity values. The weight w_ij of an edge (i,j) reflects the probability that i and j are mates, and is set to be

$\begin{displaymath}w_{ij}=\log{\frac{p_{mates}\,f(S_{ij}\vert i,j\ \textrm{are mates})}{(1-p_{mates})\,f(S_{ij}\vert i,j\ \textrm{are non-mates})}}\end{displaymath}$

where $f(S_{ij}\vert i,j\ \textrm{are mates})=f(S_{ij}\vert\mu_T,\sigma_T)$ is the value of the probability density function for mates at $S_{ij}$:

$\begin{displaymath}f(S_{ij}\vert i,j\ \textrm{are mates})=\frac{1}{\sqrt{2\pi}\sigma_T}e^{-\frac{(S_{ij}-\mu_T)^2}{2\sigma_T^2}}\end{displaymath}$

Similarly, $f(S_{ij}\vert i,j\ \textrm{are non-mates})$ is the value of the probability density function for non-mates at $S_{ij}$. The basic CLICK algorithm is defined in Figure 11.13.
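The weight formula can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the non-mates density is assumed to be a normal density with its own parameters (called `mu_f` and `sigma_f` here), mirroring the mates density given above. The computation is done on log-densities to avoid taking the logarithm of a value that has underflowed to zero.

```python
import math


def log_normal_pdf(x, mu, sigma):
    """Logarithm of the normal density with mean mu and std. dev. sigma at x."""
    return -math.log(math.sqrt(2.0 * math.pi) * sigma) - (x - mu) ** 2 / (2.0 * sigma ** 2)


def edge_weight(s_ij, p_mates, mu_t, sigma_t, mu_f, sigma_f):
    """Log-odds weight w_ij for a pair with similarity s_ij.

    mu_t, sigma_t: parameters of the mates density (from the text).
    mu_f, sigma_f: parameters of the non-mates density (ASSUMED normal here).
    """
    log_mates = math.log(p_mates) + log_normal_pdf(s_ij, mu_t, sigma_t)
    log_non_mates = math.log(1.0 - p_mates) + log_normal_pdf(s_ij, mu_f, sigma_f)
    return log_mates - log_non_mates


if __name__ == "__main__":
    # Hypothetical parameters: mates centred at 1.0, non-mates at 0.0.
    for s in (0.0, 0.5, 1.0):
        print(s, edge_weight(s, 0.5, 1.0, 1.0, 0.0, 1.0))
```

With these made-up parameters, a similarity close to the mates mean yields a positive weight and one close to the non-mates mean yields a negative weight, which is the sign behaviour the kernel test relies on.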

**Figure 11.13:** The Basic-CLICK algorithm
[Figure: boxed pseudocode listing of the Basic-CLICK procedure; the listing is truncated in this copy and only its closing "end if" and "end" lines survive.]

The idea behind the algorithm is the following: given a connected graph G, we would like to decide whether V(G) is a subset of some true cluster or contains elements from at least two true clusters. In the first case we say that G is pure. To make this decision, we test the following two hypotheses for each cut C in G:

H₀^C: C contains only edges between non-mates.
H₁^C: C contains only edges between mates.

G is declared a kernel if H₁^C is more probable than H₀^C for every cut C in G. By the following lemma (11.6), it suffices to compute a minimum weight cut of G to determine whether G is a kernel; here the weight of a cut is the sum of the weights of its edges.

**Lemma 11.6:** $G$ is a kernel iff $MinWeightCut(G)>0$.

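**Proof:** Using Bayes' Theorem, it can be shown that for every cut $C$ of $G$

$\begin{displaymath}W(C)=\log{\frac{Pr(H_1^C\vert C)}{Pr(H_0^C\vert C)}}\end{displaymath}$

where $W(C)$ is the total weight of the edges in $C$. Hence $W(C)>0$ iff $Pr(H_1^C\vert C)>Pr(H_0^C\vert C)$. If $MinWeightCut(G)>0$, then every cut of $G$ has positive weight, so $H_1^C$ is more probable for all cuts and $G$ is a kernel. Conversely, if $MinWeightCut(G)\leq 0$, then there is a cut $C$ with $W(C)\leq 0$, that is, $Pr(H_1^C\vert C)\leq Pr(H_0^C\vert C)$, therefore $G$ is not a kernel.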
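The kernel test of the lemma, and a recursive procedure built on it, can be sketched as follows. Several things here are assumptions rather than the algorithm of Figure 11.13 itself: the control flow (set singletons aside, report kernels, otherwise split along a minimum weight cut and recurse on both sides), all function names, and the brute-force cut search. The search enumerates every bipartition, so it is exponential and meant only for toy graphs; it is used because edge weights may be negative, which rules out standard min-cut routines that require non-negative weights.

```python
from itertools import combinations


def cut_weight(weights, side, vertices):
    """Total weight of edges inside `vertices` with exactly one endpoint in `side`."""
    total = 0.0
    for (i, j), w in weights.items():
        if i in vertices and j in vertices and ((i in side) != (j in side)):
            total += w
    return total


def min_weight_cut(vertices, weights):
    """Brute-force minimum weight cut: returns (weight, side, other_side).

    Requires at least two vertices with sortable labels.
    Exponential in the number of vertices; illustration only.
    """
    verts = sorted(vertices)
    anchor, rest = verts[0], verts[1:]
    best = None
    # Fix `anchor` on one side so each bipartition is tried exactly once;
    # k < len(rest) keeps the other side non-empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            side = {anchor, *extra}
            w = cut_weight(weights, side, vertices)
            if best is None or w < best[0]:
                best = (w, side, set(verts) - side)
    return best


def basic_click_sketch(vertices, weights, clusters=None, singletons=None):
    """Hypothetical recursion: kernels become clusters, others are split."""
    if clusters is None:
        clusters, singletons = [], []
    vertices = set(vertices)
    if len(vertices) == 1:
        singletons.extend(vertices)          # assumed handling of singletons
    else:
        w, h, h_bar = min_weight_cut(vertices, weights)
        if w > 0:                            # kernel test from the lemma
            clusters.append(vertices)
        else:
            basic_click_sketch(h, weights, clusters, singletons)
            basic_click_sketch(h_bar, weights, clusters, singletons)
    return clusters, singletons


# Toy graph with made-up weights: two tight groups and one stray vertex.
toy_weights = {
    (1, 2): 2.0, (1, 3): 2.0, (2, 3): 2.0,   # likely mates
    (4, 5): 2.0,                             # likely mates
    (3, 4): -1.0, (2, 5): -1.0, (5, 6): -1.0,  # likely non-mates
}
toy_clusters, toy_singletons = basic_click_sketch(range(1, 7), toy_weights)

if __name__ == "__main__":
    print(sorted(sorted(c) for c in toy_clusters), toy_singletons)
```

On this toy input the whole graph is not a kernel because the cut separating {4, 5} from the rest has negative weight, so the sketch splits there and continues on each side.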


Peer Itsik
2001-01-31