IEEE J Biomed Health Inform. 2013 Jan;17(1):232-9. doi: 10.1109/TITB.2012.2222654. Epub 2012 Oct 4.
Disease outbreaks due to contaminated food are a major concern not only for the food-processing industry but also for the public at large. Techniques for automated detection and classification of microorganisms can be a great help in preventing outbreaks and maintaining the safety of the nation's food supply. Identification and classification of foodborne pathogens using colony scatter patterns is a promising new label-free technique that utilizes image-analysis and machine-learning tools. However, the feature-extraction tools employed for this approach are computationally complex, and choosing the right combination of scatter-related features requires extensive testing with different feature combinations. In the presented work, we used computer clusters to speed up the feature-extraction process, which enabled us to analyze the contribution of different scatter-based features to the overall classification accuracy. A set of 1000 scatter patterns representing ten different bacterial strains was used. Zernike and Chebyshev moments as well as Haralick texture features were computed from the available light-scatter patterns. The most promising features were first selected using Fisher's discriminant analysis, and subsequently a support-vector-machine (SVM) classifier with a linear kernel was used. With extensive testing, we were able to identify a small subset of features that produced the desired results in terms of classification accuracy and execution speed. The use of distributed computing for scatter-pattern analysis, feature extraction, and selection provides a feasible mechanism for large-scale deployment of a light-scatter-based approach to bacterial classification.
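The feature-selection step described above can be illustrated with a minimal sketch. The abstract does not state the exact criterion used, so this assumes the common multi-class Fisher score (between-class scatter divided by within-class scatter, computed per feature). The function names `fisher_scores` and `top_k_features` and the toy data are hypothetical and are not taken from the paper. In a full pipeline, the selected columns would then be passed to a linear-kernel SVM, which is not shown here.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def fisher_scores(samples, labels):
    """Return one Fisher score per feature column (higher = more discriminative).

    Assumed criterion: sum_c n_c * (mean_c - mean)^2 / sum_c n_c * var_c.
    """
    # Group the feature vectors by class label.
    by_class = defaultdict(list)
    for row, label in zip(samples, labels):
        by_class[label].append(row)

    scores = []
    for j in range(len(samples[0])):
        overall_mean = fmean(row[j] for row in samples)
        between = 0.0  # spread of the class means around the overall mean
        within = 0.0   # spread of the samples inside each class
        for rows in by_class.values():
            column = [row[j] for row in rows]
            between += len(column) * (fmean(column) - overall_mean) ** 2
            within += len(column) * pvariance(column)
        if within > 0:
            scores.append(between / within)
        elif between > 0:
            scores.append(float("inf"))  # classes separated with zero spread
        else:
            scores.append(0.0)           # constant feature carries no information
    return scores


def top_k_features(samples, labels, k):
    """Return the indices of the k highest-scoring features."""
    scores = fisher_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data (hypothetical): feature 0 separates the two classes, feature 1 does not.
samples = [[1.0, 5.0], [1.2, 1.0], [3.0, 4.8], [3.2, 1.2]]
labels = ["A", "A", "B", "B"]
selected = top_k_features(samples, labels, k=1)
```

Ranking features one at a time is cheap and parallelizes easily across feature columns, but it ignores redundancy between features, which is one reason testing different feature combinations, as the abstract describes, remains necessary.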