Wilkins M F, Boddy L, Morris C W, Jonker R
School of Pure and Applied Biology, University of Wales, Cardiff, UK.
Comput Appl Biosci. 1996 Feb;12(1):9-18. doi: 10.1093/bioinformatics/12.1.9.
Four artificial neural network paradigms (multilayer perceptron networks, learning vector quantization networks, and radial and asymmetric basis function networks) and two statistical methods (parametric statistical classification by modelling each class with Gaussian distributions, and non-parametric density estimation via the K-nearest neighbour method) were compared for their ability to identify seven freshwater and five marine phytoplankton species from flow cytometric data. Kohonen self-organizing maps were also used to examine similarities between species. Optimized networks and statistical methods performed similarly, correctly identifying between 86.8% and 90.1% of data from freshwater species, and between 81.3% and 84.1% of data from marine species. Choice of identification technique must therefore be made on the basis of other criteria. We highlight the way each method partitions the data space and thereby separates the data clusters, and discuss the relative merits of each with reference to complexity of data boundaries, training time, analysis time and behaviour when presented with 'novel' data.
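To illustrate how the two statistical methods named above partition a data space, the following is a minimal sketch, not the authors' implementation. It fits one Gaussian per class and, separately, classifies by majority vote among the K nearest training points. All data here are synthetic two-dimensional clusters produced by a hypothetical helper (`make_synthetic`); the function names, the choice of K = 5, and the cluster centres are assumptions for illustration and have no connection to the flow cytometric measurements or results reported in the paper.

```python
import numpy as np


def make_synthetic(n_per_class=200, seed=0):
    """Hypothetical stand-in data: three well-separated 2-D Gaussian clusters."""
    rng = np.random.default_rng(seed)
    centres = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
    X = np.vstack([rng.normal(c, 1.0, size=(n_per_class, 2)) for c in centres])
    y = np.repeat(np.arange(len(centres)), n_per_class)
    return X, y


def fit_gaussians(X, y):
    """Estimate a mean vector, covariance matrix and prior for each class."""
    params = {}
    for label in np.unique(y):
        Xc = X[y == label]
        params[int(label)] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False), len(Xc) / len(X))
    return params


def predict_gaussian(params, X):
    """Assign each row of X to the class with the highest Gaussian log-posterior."""
    labels = sorted(params)
    scores = []
    for label in labels:
        mean, cov, prior = params[label]
        diff = X - mean
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every row from the class mean.
        maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
        scores.append(-0.5 * (maha + logdet) + np.log(prior))
    return np.array(labels)[np.argmax(scores, axis=0)]


def predict_knn(X_train, y_train, X, k=5):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    d2 = ((X[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = y_train[nearest]
    # Labels are assumed to be non-negative integers so bincount can tally them.
    return np.array([np.bincount(v).argmax() for v in votes])


if __name__ == "__main__":
    X_train, y_train = make_synthetic(seed=0)
    X_test, y_test = make_synthetic(seed=1)
    gauss_pred = predict_gaussian(fit_gaussians(X_train, y_train), X_test)
    knn_pred = predict_knn(X_train, y_train, X_test, k=5)
    print("Gaussian accuracy:", (gauss_pred == y_test).mean())
    print("K-NN accuracy:", (knn_pred == y_test).mean())
```

The Gaussian classifier draws smooth quadratic boundaries determined by each class's mean and covariance, whereas the K-nearest neighbour rule follows the local arrangement of training points and must search the stored training set at analysis time. That difference is one concrete instance of the trade-offs in boundary complexity, training time and analysis time that the abstract mentions.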