McElhinney James M W R, Catacutan Mary Krystelle, Mawart Aurelie, Hasan Ayesha, Dias Jorge
Applied Genomics Laboratory, Center for Membranes and Advanced Water Technology, Khalifa University, Abu Dhabi, United Arab Emirates.
Department of Biomedical Engineering, Khalifa University, Abu Dhabi, United Arab Emirates.
Front Microbiol. 2022 Apr 25;13:851450. doi: 10.3389/fmicb.2022.851450. eCollection 2022.
Microbial communities are ubiquitous and carry an exceptionally broad metabolic capability. Upon environmental perturbation, microbes are also among the first natural elements to respond, and they do so with perturbation-specific cues and markers. These communities are therefore uniquely positioned to inform on environmental conditions. The advent of microbial omics has produced an unprecedented volume of complex microbiological data sets. Importantly, these data sets are rich in biological information with potential for predictive environmental classification and forecasting. However, the patterns in this information are often hidden within the inherent complexity of the data. Machine learning (ML) and deep learning architectures are increasingly being developed and adopted to solve research challenges of this sort. Indeed, the interface between molecular microbial ecology and artificial intelligence (AI) appears to hold considerable potential for advancing environmental monitoring and management practices. Here, we provide a primer for ML, highlight the notion of retaining biological sample information for supervised ML, discuss workflow considerations, and review the state of the art of the exciting, yet nascent, interdisciplinary field of ML-driven microbial ecology. We also address current limitations in this sphere of research to frame a forward-looking perspective on what we anticipate will become a pivotal toolkit for addressing environmental monitoring and management challenges in the years ahead.