Wang Xu-Wen, Sun Zheng, Jia Huijue, Michel-Mata Sebastian, Angulo Marco Tulio, Dai Lei, He Xuesong, Weiss Scott T, Liu Yang-Yu
bioRxiv. 2023 Mar 31:2023.03.15.532858. doi: 10.1101/2023.03.15.532858.
Previous studies suggested that microbial communities harbor keystone species whose removal can cause a dramatic shift in microbiome structure and functioning. Yet an efficient method to systematically identify keystone species in microbial communities is still lacking. This is mainly due to our limited knowledge of microbial dynamics and the experimental and ethical difficulties of manipulating microbial communities. Here, we propose a Data-driven Keystone species Identification (DKI) framework based on deep learning to resolve this challenge. Our key idea is to implicitly learn the assembly rules of microbial communities from a particular habitat by training a deep learning model using microbiome samples collected from this habitat. The well-trained deep learning model enables us to quantify the community-specific keystoneness of each species in any microbiome sample from this habitat by conducting a thought experiment on species removal. We systematically validated this DKI framework using synthetic data generated from a classical population dynamics model in community ecology. We then applied DKI to analyze human gut, oral, soil, and coral microbiome data. We found that taxa with high median keystoneness across different communities display strong community specificity, and many of them have been reported as keystone taxa in the literature. The presented DKI framework demonstrates the power of machine learning in tackling a fundamental problem in community ecology, paving the way for the data-driven management of complex microbial communities.
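The species-removal thought experiment described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not give a formula for keystoneness, so the score here (Bray-Curtis dissimilarity between the predicted post-removal composition and a simple renormalized null composition, weighted by one minus the removed species' abundance) is only one plausible choice. `toy_model` is a hypothetical hand-written rule standing in for the trained deep learning model, and `keystoneness` and `bray_curtis` are illustrative names, not part of any published DKI code.

```python
import numpy as np


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    total = p.sum() + q.sum()
    return float(np.abs(p - q).sum() / total) if total > 0 else 0.0


def keystoneness(sample, species, predict_composition):
    """Score one species in one sample through an in-silico removal.

    sample: relative abundances that sum to 1.
    predict_composition: hypothetical stand-in for a trained model; it maps a
        0/1 presence vector to predicted relative abundances.
    """
    p = np.asarray(sample, dtype=float)
    if p[species] <= 0:
        return 0.0  # an absent species cannot be removed

    # Species assemblage after removing the focal species.
    present = (p > 0).astype(float)
    present[species] = 0.0
    if present.sum() == 0:
        return 0.0  # nothing is left to compare

    # Null expectation (assumption): the remaining species keep their
    # relative proportions, so we simply drop the species and renormalize.
    null = p.copy()
    null[species] = 0.0
    null /= null.sum()

    # Model prediction for the reduced assemblage.
    predicted = np.asarray(predict_composition(present), dtype=float)

    # Assumed score: structural impact, down-weighted for abundant species.
    return bray_curtis(predicted, null) * (1.0 - p[species])


def toy_model(present):
    """Hypothetical assembly rule: species 3 thrives only alongside species 0."""
    weights = np.ones(4)
    if present[0] == 0:
        weights[3] = 0.1
    raw = present * weights
    return raw / raw.sum()


# Build a sample from the full community, then score every species in it.
sample = toy_model(np.ones(4))
scores = [keystoneness(sample, i, toy_model) for i in range(4)]
```

In this toy community only species 0 receives a clearly positive score, because removing it changes the predicted composition beyond simple renormalization. With a real trained model, the same loop would be repeated over many samples from one habitat to compare per-species scores across communities.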