Covariate-Assisted Bayesian Graph Learning for Heterogeneous Data.

Author information

Niu Yabo, Ni Yang, Pati Debdeep, Mallick Bani K

Affiliations

Department of Mathematics, University of Houston.

Department of Statistics, Texas A&M University.

Publication information

J Am Stat Assoc. 2024;119(547):1985-1999. doi: 10.1080/01621459.2023.2233744. Epub 2023 Sep 6.

Abstract

In a traditional Gaussian graphical model, data homogeneity is routinely assumed, with no extra variables affecting the conditional independence structure. Modern genomic datasets contain an abundance of auxiliary information, which is often under-utilized in determining the joint dependency structure. In this article, we consider a Bayesian approach to modeling the undirected graphs underlying heterogeneous multivariate observations, with additional assistance from covariates. Building on product partition models, we propose a novel covariate-dependent Gaussian graphical model that allows graphs to vary with covariates, so that observations whose covariates are similar share a similar undirected graph. To embed Gaussian graphical models efficiently into the proposed framework, we explore both Gaussian likelihood and pseudo-likelihood functions. For the Gaussian likelihood, a G-Wishart distribution is used as a natural conjugate prior; for the pseudo-likelihood, a product of Gaussian conditionals is used. Moreover, the proposed model has large prior support and is flexible enough to approximate conditional variance-covariance matrices satisfying a Hölder smoothness condition. We further show, based on the theory of fractional likelihood, that the rate of posterior contraction is minimax optimal, assuming the true density is a Gaussian mixture with a known number of components. The efficacy of the approach is demonstrated via simulation studies and an analysis of a protein network for a breast cancer dataset, assisted by mRNA gene expression as covariates.
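
The "product of Gaussian conditionals" behind the pseudo-likelihood can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single zero-mean observation and a fixed, known precision matrix, and it leaves out the priors and the covariate-dependent partition entirely. The function name `gaussian_pseudo_loglik` and the example matrix are hypothetical. The sketch relies only on the standard fact that, for a zero-mean Gaussian with precision matrix omega, the conditional distribution of y_j given the other coordinates is normal with mean -(1/omega_jj) * sum over k != j of omega_jk * y_k and variance 1/omega_jj.

```python
import math

def gaussian_pseudo_loglik(omega, y):
    """Log pseudo-likelihood of one zero-mean observation y.

    omega is a precision matrix given as a list of lists (assumed symmetric
    with positive diagonal). The result is the sum over nodes j of the log
    density of y_j given all other coordinates.
    """
    p = len(y)
    total = 0.0
    for j in range(p):
        w_jj = omega[j][j]
        # Conditional mean of y_j given the rest; a zero entry omega[j][k]
        # (no edge between j and k) removes y_k from this sum.
        cond_mean = -sum(omega[j][k] * y[k] for k in range(p) if k != j) / w_jj
        resid = y[j] - cond_mean
        # Normal log density with variance 1 / w_jj.
        total += 0.5 * math.log(w_jj) - 0.5 * math.log(2 * math.pi) - 0.5 * w_jj * resid ** 2
    return total

# Hypothetical 3-node chain graph: edges 1-2 and 2-3, no edge between 1 and 3.
omega = [[2.0, 0.5, 0.0],
         [0.5, 2.0, 0.5],
         [0.0, 0.5, 2.0]]
y = [0.3, -1.2, 0.8]
print(gaussian_pseudo_loglik(omega, y))
```

Because each node contributes its own univariate regression term, the pseudo-likelihood avoids evaluating the determinant of the full precision matrix. When the precision matrix is diagonal, it coincides with the exact Gaussian log-likelihood.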
