Bayesian Variable Shrinkage and Selection in Compositional Data Regression: Application to Oral Microbiome.

Authors

Datta Jyotishka, Bandyopadhyay Dipankar

Affiliations

Department of Statistics, Virginia Polytechnic Institute and State University, 250 Drillfield Drive, Blacksburg, VA 24061 USA.

Department of Biostatistics, School of Population Health, Virginia Commonwealth University, One Capital Square, 7th Floor, 830 East Main Street, PO Box 980032, Richmond, VA 23298-0032 USA.

Publication information

J Indian Soc Probab Stat. 2024;25(2):491-515. doi: 10.1007/s41096-024-00194-9. Epub 2024 May 29.

DOI:10.1007/s41096-024-00194-9

PMID:39403125

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11470902/

Abstract

Microbiome studies generate multivariate compositional responses, such as taxa counts, which are strictly non-negative, bounded, reside within a simplex, and are subject to a unit-sum constraint. In the presence of covariates (which can be moderate to high dimensional), they are popularly modeled via the Dirichlet-Multinomial (D-M) regression framework. In this paper, we consider a Bayesian approach for estimation and inference under a D-M compositional framework, and present a comparative evaluation of some state-of-the-art continuous shrinkage priors for efficient variable selection to identify the most significant associations between available covariates and taxonomic abundance. Specifically, we compare the performance of the horseshoe and horseshoe+ priors against the benchmark Bayesian lasso, utilizing Hamiltonian Monte Carlo techniques for posterior sampling and for generating posterior credible intervals. Our simulation studies using synthetic data demonstrate excellent recovery and estimation accuracy in sparse parameter regimes by the continuous shrinkage priors. We further illustrate our method via application to a motivating oral microbiome dataset generated from the NYC-HANES study. An RStan implementation of our method is available on GitHub: https://github.com/dattahub/compshrink.
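
As an illustration of the ingredients named in the abstract, the following is a minimal Python sketch, not the authors' RStan code (which lives in the compshrink repository linked above). It evaluates the Dirichlet-Multinomial log-probability of one sample's taxa counts and draws single coefficients from the three priors being compared. Several things here are assumptions made for illustration: the log-linear link alpha_k = exp(intercept_k + x . beta[:, k]), the function names `dm_log_pmf`, `dm_alpha`, `half_cauchy` and `draw_coefficient`, the fixed global scale `tau`, and the toy covariates and counts. The Bayesian lasso is represented only by its Laplace prior on a coefficient.

```python
import math
import random


def dm_log_pmf(counts, alpha):
    """Log-probability of one count vector under a Dirichlet-Multinomial."""
    n = sum(counts)
    a0 = sum(alpha)
    out = math.lgamma(n + 1) + math.lgamma(a0) - math.lgamma(n + a0)
    for y, a in zip(counts, alpha):
        out += math.lgamma(y + a) - math.lgamma(a) - math.lgamma(y + 1)
    return out


def dm_alpha(x, beta, intercept):
    """Assumed log-linear link: alpha_k = exp(intercept_k + sum_p x_p * beta[p][k]).

    x is a list of P covariates, beta a P-by-K list of lists, intercept has length K.
    """
    return [
        math.exp(b0 + sum(xp * row[k] for xp, row in zip(x, beta)))
        for k, b0 in enumerate(intercept)
    ]


def half_cauchy(rng, scale=1.0):
    """Half-Cauchy(0, scale) draw via the inverse CDF."""
    return scale * math.tan(0.5 * math.pi * rng.random())


def draw_coefficient(rng, prior, tau=1.0):
    """Draw one regression coefficient from the named prior (tau held fixed here)."""
    if prior == "lasso":
        # Laplace(0, tau): exponential magnitude with a random sign.
        magnitude = rng.expovariate(1.0 / tau)
        return magnitude if rng.random() < 0.5 else -magnitude
    if prior == "horseshoe":
        # Local scale lambda ~ half-Cauchy(0, 1).
        lam = half_cauchy(rng)
    elif prior == "horseshoe+":
        # Extra layer: lambda ~ half-Cauchy(0, eta), eta ~ half-Cauchy(0, 1).
        lam = half_cauchy(rng, scale=half_cauchy(rng))
    else:
        raise ValueError("unknown prior: " + prior)
    return rng.gauss(0.0, tau * lam)


# Toy example (hypothetical values): 2 covariates, 3 taxa, sparse coefficients.
rng = random.Random(1)
x = [1.0, -0.5]
beta = [[0.8, 0.0, 0.0], [0.0, 0.0, -1.2]]
intercept = [0.0, 0.0, 0.0]
alpha = dm_alpha(x, beta, intercept)
loglik = dm_log_pmf([12, 5, 3], alpha)
draws = {
    name: [draw_coefficient(rng, name) for _ in range(5)]
    for name in ("lasso", "horseshoe", "horseshoe+")
}
print("alpha:", alpha)
print("log-likelihood:", loglik)
print("prior draws:", draws)
```

In a full analysis the coefficients and scales would be sampled jointly from the posterior, for example with Hamiltonian Monte Carlo as the abstract describes, rather than drawn from the prior as in this sketch.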
