Ascari Roberto, Migliorati Sonia, Ongaro Andrea
Department of Economics, Management and Statistics (DEMS), University of Milano-Bicocca, Milano, Italy.
Stat Med. 2025 Aug;44(18-19):e70220. doi: 10.1002/sim.70220.
Motivated by the challenges of analyzing gut microbiome and metagenomic data, this paper introduces a novel mixture distribution for multivariate counts and a regression model built upon it. The proposed distribution is flexible and interpretable: it accommodates both negative and positive dependence among taxa and has numerous theoretical properties, including explicit expressions for inter- and intraclass correlations, making it a powerful tool for understanding complex microbiome interactions. Furthermore, because the regression model based on this distribution models the marginal mean of the multivariate response (i.e., the taxa counts), relationships between taxa and covariates can be clearly identified and interpreted. Inference is performed using a tailored Hamiltonian Monte Carlo estimation method combined with a spike-and-slab variable selection procedure. Extensive simulation studies and an application to a human gut microbiome dataset highlight the proposed model's substantial improvements over competing models in terms of fit, interpretability, and predictive performance.