Towards large-scale FAME-based bacterial species identification using machine learning techniques.

Author information

Slabbinck Bram, De Baets Bernard, Dawyndt Peter, De Vos Paul

Affiliation

Research Unit Knowledge-based Systems, Faculty of Bioscience Engineering, Ghent University, Coupure links 653, 9000 Ghent, Belgium.

Publication information

Syst Appl Microbiol. 2009 May;32(3):163-76. doi: 10.1016/j.syapm.2009.01.003. Epub 2009 Feb 23.

DOI:10.1016/j.syapm.2009.01.003

PMID:19237256

Abstract

Over the last decade, bacterial taxonomy has expanded enormously. The swift pace of bacterial species (re-)definitions seriously affects the accuracy and completeness of first-line identification methods. Consequently, back-end identification libraries need to be synchronized with the List of Prokaryotic names with Standing in Nomenclature. In this study, we focus on bacterial fatty acid methyl ester (FAME) profiling as a broadly used first-line identification method. From the BAME@LMG database, we selected FAME profiles of individual strains belonging to the genera Bacillus, Paenibacillus and Pseudomonas. Only profiles resulting from standard growth conditions were retained. The corresponding data set covers 74, 44 and 95 validly published bacterial species, respectively, represented by 961, 378 and 1673 standard FAME profiles. Machine learning techniques were applied in a supervised strategy to build different computational models for genus and species identification. Three techniques were considered: artificial neural networks, random forests and support vector machines. Nearly perfect identification was achieved at genus level. Notwithstanding the known limited discriminative power of FAME analysis for species identification, the computational models gave good species identification results for the three genera. For Bacillus, Paenibacillus and Pseudomonas, random forests yielded sensitivity values of 0.847, 0.901 and 0.708, respectively. The random forest models outperformed those of the other machine learning techniques. Moreover, our machine learning approach also outperformed the Sherlock MIS (MIDI Inc., Newark, DE, USA). These results show that machine learning is very useful for FAME-based bacterial species identification. Besides good bacterial identification at species level, speed and ease of taxonomic synchronization are major advantages of this computational species identification strategy.
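
The abstract reports species-level performance as sensitivity values but does not say how they were aggregated across species. As an illustration only, the sketch below computes per-species sensitivity, TP / (TP + FN), and its unweighted (macro) average from true and predicted labels. The function names, the macro-averaging choice, and the toy labels are assumptions made for this example; they are not taken from the study.

```python
from collections import defaultdict


def per_class_sensitivity(true_labels, predicted_labels):
    """Return {label: TP / (TP + FN)} for every label present in true_labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one labelled profile is required")
    true_positives = defaultdict(int)
    support = defaultdict(int)  # number of profiles that truly belong to each label
    for truth, guess in zip(true_labels, predicted_labels):
        support[truth] += 1
        if truth == guess:
            true_positives[truth] += 1
    return {label: true_positives[label] / support[label] for label in support}


def macro_sensitivity(true_labels, predicted_labels):
    """Unweighted mean of the per-class sensitivities (an assumed aggregation)."""
    scores = per_class_sensitivity(true_labels, predicted_labels)
    return sum(scores.values()) / len(scores)


# Hypothetical toy labels, not data from the study.
truth = ["B. subtilis", "B. subtilis", "B. cereus", "B. cereus", "B. cereus", "B. pumilus"]
guess = ["B. subtilis", "B. cereus", "B. cereus", "B. cereus", "B. pumilus", "B. pumilus"]

scores = per_class_sensitivity(truth, guess)
overall = macro_sensitivity(truth, guess)
```

Macro-averaging gives every species the same weight regardless of how many profiles it has, which matters for a data set where the number of profiles per species varies; a profile-weighted average would instead equal the overall fraction of correctly identified profiles.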

Similar articles

Genus-wide Bacillus species identification through proper artificial neural network experiments on fatty acid profiles.

Antonie Van Leeuwenhoek. 2008 Aug;94(2):187-98. doi: 10.1007/s10482-008-9229-z. Epub 2008 Mar 6.

[Study on species identification of Mycobacteria by gas chromatography analysis of whole-cell fatty acid].

Zhonghua Jie He He Hu Xi Za Zhi. 2005 Jun;28(6):403-6.

Artificial neural network identification of heterotrophic marine bacteria based on their fatty-acid composition.

IEEE Trans Biomed Eng. 1997 Dec;44(12):1185-91. doi: 10.1109/10.649990.

Fame-based Bacillus species identification using artificial neural networks.

Commun Agric Appl Biol Sci. 2006;71(1):259-62.

Bacterial species identification from MALDI-TOF mass spectra through data analysis and machine learning.

Syst Appl Microbiol. 2011 Feb;34(1):20-9. doi: 10.1016/j.syapm.2010.11.003. Epub 2011 Feb 4.

The use of fatty acid methyl esters as biomarkers to determine aerobic, facultatively aerobic and anaerobic communities in wastewater treatment systems.

FEMS Microbiol Lett. 2007 Jan;266(1):75-82. doi: 10.1111/j.1574-6968.2006.00509.x.

Description of Rummeliibacillus stabekisii gen. nov., sp. nov. and reclassification of Bacillus pycnus Nakamura et al. 2002 as Rummeliibacillus pycnus comb. nov.

Int J Syst Evol Microbiol. 2009 May;59(Pt 5):1094-9. doi: 10.1099/ijs.0.006098-0.

Application of artificial neural network for the identification of fresh water bacteria.

Stud Health Technol Inform. 2000;77:106-10.

Species identification of corynebacteria by cellular fatty acid analysis.

Diagn Microbiol Infect Dis. 2006 Feb;54(2):99-104. doi: 10.1016/j.diagmicrobio.2005.08.019. Epub 2006 Jan 19.

Cited by

Proteolytic sp. Isolation and Identification from Tannery Alkaline Baths.

Molecules. 2025 Sep 5;30(17):3632. doi: 10.3390/molecules30173632.

Identification and Characterization of Pseudomonas syringae pv. syringae, a Causative Bacterium of Apple Canker in Korea.

Plant Pathol J. 2023 Feb;39(1):88-107. doi: 10.5423/PPJ.OA.08.2022.0121. Epub 2023 Feb 1.

Identification of airborne bacteria by 16S rDNA sequencing, MALDI-TOF MS and the MIDI microbial identification system.

Aerobiologia (Bologna). 2015;31(3):271-281. doi: 10.1007/s10453-015-9363-9. Epub 2015 Jan 17.

What variables are important in predicting bovine viral diarrhea virus? A random forest approach.

Vet Res. 2015 Jul 24;46(1):85. doi: 10.1186/s13567-015-0219-7.

New marker of FAME profile of Pseudomonas aurantiaca total lipids.

Dokl Biochem Biophys. 2012 Jul-Aug;445:183-6. doi: 10.1134/S1607672912040011. Epub 2012 Sep 2.

Data mining in the Life Sciences with Random Forest: a walk in the park or lost in the jungle?

Brief Bioinform. 2013 May;14(3):315-26. doi: 10.1093/bib/bbs034. Epub 2012 Jul 10.

From learning taxonomies to phylogenetic learning: integration of 16S rRNA gene data into FAME-based bacterial classification.

BMC Bioinformatics. 2010 Jan 30;11:69. doi: 10.1186/1471-2105-11-69.
