Mishra Aditya, McNichol Jesse, Fuhrman Jed, Blei David, Müller Christian L
Department of Statistics, University of Georgia, Athens, GA, 30606, United States.
Department of Biology, St. Francis Xavier University, Antigonish, NS, B2G 2W5, Canada.
ISME Commun. 2025 May 2;5(1):ycaf062. doi: 10.1093/ismeco/ycaf062. eCollection 2025 Jan.
Linking sequence-derived microbial taxon abundances to host (patho-)physiology or habitat characteristics in a reproducible and interpretable manner remains a formidable challenge in the analysis of microbiome survey data. Here, we introduce a flexible probabilistic modeling framework, VI-MIDAS (variational inference for microbiome survey data analysis), that enables joint estimation of context-dependent drivers and broad patterns of association of microbial taxon abundances from microbiome survey data. VI-MIDAS comprises mechanisms for direct coupling of taxon abundances with covariates and for taxon-specific latent coupling, which can incorporate spatio-temporal information and taxon-taxon interactions. We use mean-field variational inference for posterior estimation of the VI-MIDAS model parameters and illustrate model building and analysis using Tara Ocean Expedition survey data. Using VI-MIDAS' latent embedding model and tools from network analysis, we show that marine microbial communities can be broadly categorized into five modules, including SAR11-, Nitrosopumilus-, and Alteromonadales-dominated communities, each associated with specific environmental and spatio-temporal signatures. VI-MIDAS also finds evidence for largely positive taxon-taxon associations in SAR11 or Rhodospirillales clades, and negative associations with Alteromonadales and Flavobacteriales classes. Our results indicate that VI-MIDAS provides a powerful integrative statistical analysis framework for discovering broad patterns of association between microbial taxa and context-specific covariate data from microbiome survey data.