
Bayesian Variable Shrinkage and Selection in Compositional Data Regression: Application to Oral Microbiome.

Authors

Datta Jyotishka, Bandyopadhyay Dipankar

Affiliations

Department of Statistics, Virginia Polytechnic Institute and State University, 250 Drillfield Drive, Blacksburg, VA 24061 USA.

Department of Biostatistics, School of Population Health, Virginia Commonwealth University, One Capital Square, 7th Floor, 830 East Main Street, PO Box 980032, Richmond, VA 23298-0032 USA.

Publication information

J Indian Soc Probab Stat. 2024;25(2):491-515. doi: 10.1007/s41096-024-00194-9. Epub 2024 May 29.

Abstract

Microbiome studies generate multivariate compositional responses, such as taxa counts, which are strictly non-negative, bounded, reside within a simplex, and are subject to a unit-sum constraint. In the presence of covariates (which can be moderate to high dimensional), they are popularly modeled via the Dirichlet-Multinomial (D-M) regression framework. In this paper, we consider a Bayesian approach for estimation and inference under a D-M compositional framework, and present a comparative evaluation of some state-of-the-art continuous shrinkage priors for efficient variable selection to identify the most significant associations between available covariates and taxonomic abundance. Specifically, we compare the performances of the horseshoe and horseshoe+ priors (with the benchmark Bayesian lasso), utilizing Hamiltonian Monte Carlo techniques for posterior sampling and generating posterior credible intervals. Our simulation studies using synthetic data demonstrate excellent recovery and estimation accuracy in the sparse parameter regime by the continuous shrinkage priors. We further illustrate our method via application to a motivating oral microbiome dataset generated from the NYC-Hanes study. An RStan implementation of our method is made available on GitHub (https://github.com/dattahub/compshrink).
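To make the model structure in the abstract concrete, the following is a minimal Python sketch of an unnormalized log-posterior for a D-M regression with a horseshoe prior. It is not the authors' RStan implementation. The log-linear link (concentration = exp of the linear predictor), the unit scales on the half-Cauchy priors, the omission of intercepts, and all function and variable names are assumptions made for illustration; the horseshoe+ prior and the Bayesian lasso are not shown.

```python
import math


def dm_logpmf(counts, alpha):
    """Log pmf of a Dirichlet-Multinomial for one sample of taxa counts."""
    n = sum(counts)
    a = sum(alpha)
    out = math.lgamma(n + 1) + math.lgamma(a) - math.lgamma(n + a)
    for y, al in zip(counts, alpha):
        out += math.lgamma(y + al) - math.lgamma(al) - math.lgamma(y + 1)
    return out


def concentrations(x, beta):
    """Assumed log-linear link: alpha_j = exp(x . beta_j) for each taxon j."""
    return [math.exp(sum(xk * bk for xk, bk in zip(x, beta_j))) for beta_j in beta]


def half_cauchy_logpdf(x, scale=1.0):
    """Log density of a half-Cauchy(0, scale) on x > 0."""
    if x <= 0:
        return -math.inf
    return math.log(2.0 / (math.pi * scale)) - math.log1p((x / scale) ** 2)


def normal_logpdf(x, sd):
    """Log density of a zero-mean normal with standard deviation sd."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * (x / sd) ** 2


def horseshoe_log_prior(beta, lam, tau):
    """Horseshoe: beta_jk ~ N(0, (lam_jk * tau)^2), lam_jk and tau ~ half-Cauchy(0, 1)."""
    total = half_cauchy_logpdf(tau)
    for beta_j, lam_j in zip(beta, lam):
        for b, l in zip(beta_j, lam_j):
            total += normal_logpdf(b, l * tau) + half_cauchy_logpdf(l)
    return total


def log_posterior(Y, X, beta, lam, tau):
    """Unnormalized log-posterior: D-M log-likelihood plus horseshoe log-prior."""
    loglik = sum(dm_logpmf(y, concentrations(x, beta)) for y, x in zip(Y, X))
    return loglik + horseshoe_log_prior(beta, lam, tau)


# Toy data (made up): 2 samples, 3 taxa, 2 covariates.
Y = [[5, 3, 2], [1, 0, 9]]
X = [[1.0, 0.2], [1.0, -1.3]]
beta = [[0.1, 0.0], [0.0, 0.5], [-0.2, 0.0]]   # one coefficient row per taxon
lam = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]     # local shrinkage scales
print(log_posterior(Y, X, beta, lam, tau=0.5))
```

A function of this shape is what a Hamiltonian Monte Carlo sampler would evaluate, after transforming the positive scales to an unconstrained space; in practice a tool such as Stan handles those transformations and the gradients automatically.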


