A Bayesian Approach for Graph-constrained Estimation for High-dimensional Regression.

Authors

Sun Hokeun, Li Hongzhe

Affiliation

Department of Biostatistics and Epidemiology, University of Pennsylvania, Philadelphia, PA 19104, USA.

Publication

Int J Syst Synth Biol. 2010;1(2):255-272.

Abstract

Many different biological processes are represented by network graphs, such as regulatory networks, metabolic pathways, and protein-protein interaction networks. Because genes that are linked on these networks usually have biologically similar functions, the linked genes form molecular modules that affect clinical phenotypes and outcomes. Similarly, in large-scale genetic association studies, many SNPs are in high linkage disequilibrium (LD), which can also be summarized as an LD graph. To incorporate this graph information into regression analysis with high-dimensional genomic data as predictors, we introduce a Bayesian approach for graph-constrained estimation and regularization (Bayesian GRACE), which controls the amount of regularization for sparsity and smoothness of the regression coefficients. Because the Bayesian estimates come with posterior distributions, the approach provides credible intervals and standard errors for the estimates of the regression coefficients. The deviance information criterion (DIC) is applied for model assessment and tuning parameter selection. The performance of the proposed Bayesian approach is evaluated through simulation studies and compared with the Bayesian Lasso and Bayesian Elastic-net procedures. We demonstrate our method in an analysis of data from a case-control genome-wide association study of neuroblastoma using a weighted LD graph.
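
To make the idea of regularizing for both sparsity and smoothness over a graph concrete, here is a minimal sketch. It assumes the penalty form commonly used in network-constrained regularization: an L1 term plus a quadratic form in the normalized graph Laplacian. The abstract does not spell out the penalty or the priors, so the function names, the tuning parameters `lam1` and `lam2`, and the example weights are all hypothetical, and the code only evaluates a penalty rather than fitting the Bayesian model.

```python
import math

def normalized_laplacian(weights):
    """Return L = I - D^{-1/2} W D^{-1/2} for a weighted, undirected graph.

    `weights` is a symmetric list-of-lists matrix of non-negative edge weights
    with a zero diagonal (for example, pairwise LD between SNPs).
    Isolated nodes get an all-zero row and column.
    """
    p = len(weights)
    degree = [sum(row) for row in weights]
    lap = [[0.0] * p for _ in range(p)]
    for u in range(p):
        for v in range(p):
            if u == v:
                lap[u][v] = 1.0 if degree[u] > 0 else 0.0
            elif weights[u][v] != 0:
                lap[u][v] = -weights[u][v] / math.sqrt(degree[u] * degree[v])
    return lap

def graph_penalty(beta, lap, lam1, lam2):
    """Assumed penalty: lam1 * sum|beta_j| (sparsity) + lam2 * beta' L beta (smoothness)."""
    p = len(beta)
    l1 = sum(abs(b) for b in beta)
    quad = sum(beta[u] * lap[u][v] * beta[v] for u in range(p) for v in range(p))
    return lam1 * l1 + lam2 * quad

# Hypothetical weighted LD graph on three SNPs: 1-2 strongly linked, 2-3 weakly linked.
W = [[0.0, 0.9, 0.0],
     [0.9, 0.0, 0.5],
     [0.0, 0.5, 0.0]]
L = normalized_laplacian(W)
print(graph_penalty([1.0, 0.5, 0.0], L, lam1=0.1, lam2=1.0))
```

The quadratic term equals the sum over edges of the weight times the squared difference of degree-scaled coefficients, so it shrinks strongly linked predictors toward similar values, while the L1 term pushes individual coefficients toward zero. A Bayesian treatment would typically encode such penalties as priors on the coefficients.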
