Eltoft T, deFigueiredo R P
Department of Physics, Faculty of Science, University of Tromsø, N-9037 Tromsø, Norway.
IEEE Trans Neural Netw. 1998;9(5):1021-35. doi: 10.1109/72.712183.
We propose a new unsupervised neural network that clusters a set of experimental data according to a given generic interpoint similarity measure and then assigns each new input its appropriate cluster label. The network does this for clusters of any shape, without knowing in advance how many clusters will be created. We call this two-layer network a cluster-detection-and-labeling (CDL) network. The CDL network combines the concept of similarity with that of closeness with regard to distance. Specifically, clusters are represented by a set of prototypes, and the similarity between an input vector and each prototype is calculated as the inner product of the two vectors, which is then compared to a threshold. These thresholds, which depend on the distance between the input vector and the prototype, are computed in a separate threshold-calculating unit. During clustering, the data are cycled through the network several times. At the end of each cycle the clusters are evaluated, and only those with more than a specified number of samples are retained. The others are fed back to be reclustered by an updated network. The process terminates according to a suitable criterion, such as when a prespecified portion of the data has been classified. The performance of the CDL network has been compared with that of the winner-take-all (WTA) network, which is widely used in cluster analysis applications, for several different cluster structures. These studies demonstrate that the new network performs well for all the tested cluster shapes, including cases where the WTA network fails completely.
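The clustering cycle described above can be illustrated with a minimal Python sketch. It is not the authors' network: the abstract does not give the threshold formula or say how the network is updated between cycles, so both are stand-ins. The names `threshold`, `cdl_sketch`, `label_new_input`, `radius`, `min_count`, `target_fraction`, and `growth` are hypothetical. The assumed threshold is chosen so that the inner-product test is equivalent to a Euclidean distance test, and the assumed "update" simply enlarges the radius after each cycle.

```python
def inner(x, p):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, p))


def threshold(x, p, radius):
    # ASSUMPTION: stand-in for the threshold-calculating unit.
    # Since ||x - p||^2 = x.x + p.p - 2 x.p, the test
    # inner(x, p) >= threshold(x, p, radius) holds exactly when ||x - p|| <= radius.
    return 0.5 * (inner(x, x) + inner(p, p) - radius ** 2)


def margin(x, p, radius):
    """Inner product minus threshold; non-negative means x matches prototype p."""
    return inner(x, p) - threshold(x, p, radius)


def cdl_sketch(data, radius, min_count, target_fraction=0.9,
               max_cycles=5, growth=1.5):
    """Cycle the data, keep clusters with more than min_count samples,
    and feed the rest back. Returns (labels, prototypes, radius used last)."""
    labels = [None] * len(data)
    prototypes = []          # list of (vector, cluster label) pairs
    next_label = 0
    used_radius = radius
    for _ in range(max_cycles):
        used_radius = radius
        for i, x in enumerate(data):
            if labels[i] is not None:
                continue     # already classified in an earlier cycle
            hits = sorted({lab for p, lab in prototypes
                           if margin(x, p, radius) >= 0})
            if hits:
                # Join the first matching cluster and merge any others into it.
                keep, merged = hits[0], set(hits[1:])
                prototypes = [(p, keep if lab in merged else lab)
                              for p, lab in prototypes]
                labels = [keep if lab in merged else lab for lab in labels]
            else:
                keep, next_label = next_label, next_label + 1
            labels[i] = keep
            prototypes.append((x, keep))
        # Evaluate clusters at the end of the cycle.
        sizes = {}
        for lab in labels:
            if lab is not None:
                sizes[lab] = sizes.get(lab, 0) + 1
        small = {lab for lab, n in sizes.items() if n <= min_count}
        labels = [None if lab in small else lab for lab in labels]
        prototypes = [(p, lab) for p, lab in prototypes if lab not in small]
        if sum(lab is not None for lab in labels) >= target_fraction * len(data):
            break
        radius *= growth     # ASSUMPTION: the "updated network" relaxes thresholds
    return labels, prototypes, used_radius


def label_new_input(x, prototypes, radius):
    """Label of the best-matching prototype, or None if no threshold is met."""
    best = max(prototypes, key=lambda pl: margin(x, pl[0], radius), default=None)
    if best is None or margin(x, best[0], radius) < 0:
        return None
    return best[1]
```

Because each cluster keeps every accepted sample as a prototype, a chain of nearby points can form an elongated cluster. That is one plausible reading of how a prototype set can represent clusters of any shape, but it is not confirmed by the abstract. A call such as `cdl_sketch(points, radius=0.6, min_count=2, target_fraction=0.8)` returns the labels, the retained prototypes, and the radius to pass to `label_new_input`.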