Next: Refinements
Up: The CLICK Algorithm
Previous: Probabilistic Assumptions
The CLICK algorithm represents the input data as a weighted similarity
graph G = (V,E). In this graph vertices correspond to elements and edge
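weights are derived from the similarity values. The weight $w_{ij}$ of an
edge (i,j) reflects the probability that i and j are mates, and is set
to be
$$w_{ij} = \log \frac{p_{mates}\, f^M(S_{ij})}{(1 - p_{mates})\, f^N(S_{ij})}$$
where $f^M(S_{ij})$ is the value of the probability
density function for mates at $S_{ij}$:
$$f^M(S_{ij}) = \frac{1}{\sqrt{2\pi}\,\sigma_T}\, e^{-\frac{(S_{ij}-\mu_T)^2}{2\sigma_T^2}}$$
Similarly, $f^N(S_{ij})$ is the value of the
probability density function for non-mates.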
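As a rough illustration, the edge weight can be computed as a log-odds ratio. This is a minimal sketch, assuming that similarity values are normally distributed for both mates and non-mates and that the prior probability of a pair being mates is known. The names `edge_weight`, `log_normal_pdf`, `p_mates`, `mu_t`, `sigma_t`, `mu_f` and `sigma_f` are hypothetical and chosen for this example only. The densities are evaluated in log space so that very small values do not underflow.

```python
import math


def log_normal_pdf(x, mean, std):
    """Natural log of the density of N(mean, std^2) at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def edge_weight(s_ij, p_mates, mu_t, sigma_t, mu_f, sigma_f):
    """Log-odds that two elements with similarity s_ij are mates.

    Assumption (not taken from the text above): mates follow
    N(mu_t, sigma_t^2) and non-mates follow N(mu_f, sigma_f^2).
    """
    log_prior_odds = math.log(p_mates) - math.log(1.0 - p_mates)
    log_density_ratio = (log_normal_pdf(s_ij, mu_t, sigma_t)
                         - log_normal_pdf(s_ij, mu_f, sigma_f))
    return log_prior_odds + log_density_ratio


# Illustrative parameters only: equal priors, unit variances.
w_high = edge_weight(1.0, 0.5, mu_t=1.0, sigma_t=1.0, mu_f=0.0, sigma_f=1.0)
w_mid = edge_weight(0.5, 0.5, mu_t=1.0, sigma_t=1.0, mu_f=0.0, sigma_f=1.0)
w_low = edge_weight(0.0, 0.5, mu_t=1.0, sigma_t=1.0, mu_f=0.0, sigma_f=1.0)
```

With these illustrative parameters, a similarity close to the mates' mean gives a positive weight, a similarity close to the non-mates' mean gives a negative weight, and the midpoint between the two means gives a weight of zero.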
The basic CLICK algorithm is defined in figure 11.13.
Figure 11.13: The Basic-CLICK algorithm.
The idea behind the algorithm is the following: given a connected graph G, we
would like to decide whether V(G) is a subset of some true cluster, or
V(G) contains elements from at least two true clusters. In the first case
we say that G is pure. In order to make this decision we test for
each cut C in G the following two hypotheses:
- $H_0^C$: C contains only edges between non-mates.
- $H_1^C$: C contains only edges between mates.
G is declared a kernel if $H_1^C$ is more probable than $H_0^C$ for every cut C.
By the following lemma (11.6), this can be decided by computing a minimum
weight cut of G: G is a kernel if and only if the weight of that cut is positive.
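The kernel test and the recursive splitting it drives can be sketched as follows. This is an illustrative sketch only, not the procedure of figure 11.13 itself: it assumes that a single vertex is set aside as a singleton, that a kernel is output as a cluster, and that any other graph is split along a minimum weight cut and both sides are processed recursively. The minimum weight cut is found by brute force over all bipartitions, which is exponential and only usable on tiny graphs. The names `cut_weight`, `min_weight_cut` and `basic_click` are hypothetical, as are the example weights.

```python
from itertools import combinations


def cut_weight(weights, side, vertices):
    """Total weight of edges with exactly one endpoint in `side`."""
    other = vertices - side
    return sum(weights.get(frozenset((u, v)), 0.0)
               for u in side for v in other)


def min_weight_cut(weights, vertices):
    """Brute-force minimum weight cut: returns (weight, side, other)."""
    ordered = sorted(vertices)
    anchor, rest = ordered[0], ordered[1:]
    best = None
    # Fix `anchor` on one side so each bipartition is visited once;
    # k < len(rest) keeps the other side non-empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            side = frozenset((anchor,) + extra)
            w = cut_weight(weights, side, vertices)
            if best is None or w < best[0]:
                best = (w, side, vertices - side)
    return best


def basic_click(weights, vertices, clusters=None, singletons=None):
    """Recursive sketch: split along minimum cuts until kernels remain."""
    if clusters is None:
        clusters, singletons = [], []
    vertices = frozenset(vertices)
    if len(vertices) == 1:
        singletons.extend(vertices)
    else:
        w, side, other = min_weight_cut(weights, vertices)
        if w > 0:  # kernel: every cut has positive weight
            clusters.append(set(vertices))
        else:
            basic_click(weights, side, clusters, singletons)
            basic_click(weights, other, clusters, singletons)
    return clusters, singletons


# Hypothetical weights: two tight groups joined by negative edges.
example_weights = {
    frozenset(("a", "b")): 2.0, frozenset(("a", "c")): 2.0,
    frozenset(("b", "c")): 2.0, frozenset(("d", "e")): 2.0,
    frozenset(("a", "d")): -1.0, frozenset(("b", "e")): -1.0,
}
clusters, singletons = basic_click(example_weights, "abcde")
```

In this example the minimum weight cut of the whole graph consists of the two negative edges, so the graph is not a kernel and is split into {a, b, c} and {d, e}. Each part then has only positive cuts and is reported as a cluster. A practical implementation would replace the brute-force search with an efficient minimum cut routine.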
Peer Itsik
2001-01-31