Inclusion of biological knowledge in a Bayesian shrinkage model for joint estimation of SNP effects.

Authors

Pereira Miguel, Thompson John R, Weichenberger Christian X, Thomas Duncan C, Minelli Cosetta

Affiliations

National Heart and Lung Institute, Imperial College London, London, United Kingdom.

Department of Health Sciences, University of Leicester, Leicester, United Kingdom.

Publication information

Genet Epidemiol. 2017 May;41(4):320-331. doi: 10.1002/gepi.22038. Epub 2017 Apr 10.

Abstract

With the aim of improving detection of novel single-nucleotide polymorphisms (SNPs) in genetic association studies, we propose a method of including prior biological information in a Bayesian shrinkage model that jointly estimates SNP effects. We assume that the SNP effects follow a normal distribution centered at zero with variance controlled by a shrinkage hyperparameter. We use biological information to define the amount of shrinkage applied to the distribution of SNP effects, so that the effects of SNPs with more biological support are shrunk less toward zero and are therefore more likely to be detected. The performance of the method was tested in a simulation study (1,000 datasets, 500 subjects with ∼200 SNPs in 10 linkage disequilibrium (LD) blocks) using a continuous and a binary outcome. It was further tested in an empirical example on body mass index (continuous) and overweight (binary) in a dataset of 1,829 subjects and 2,614 SNPs from 30 blocks. Biological knowledge was retrieved using the bioinformatics tool Dintor, which queried various databases. The joint Bayesian model with inclusion of prior information outperformed the standard analysis: in the simulation study, the mean ranking of the true LD block was 2.8 for the Bayesian model versus 3.6 for the standard analysis of individual SNPs; in the empirical example, the mean ranking of the six true blocks was 8.5 versus 9.3 in the standard analysis. These results suggest that our method is more powerful than the standard analysis. We expect its performance to improve further as more biological information about SNPs becomes available.
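
The shrinkage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a continuous outcome with known residual variance, in which case a prior beta_j ~ N(0, tau_j^2) on each effect gives a closed-form posterior mean that solves (X'X / sigma^2 + D^-1) beta = X'y / sigma^2, with D = diag(tau_j^2). The function names, the toy design matrix, and the prior variances are hypothetical; how biological information is mapped to a prior variance is not specified in the abstract and is simply assumed here.

```python
# Illustrative sketch only: NOT the authors' implementation.
# Assumed model: y = X @ beta + noise, noise ~ N(0, sigma2),
# with independent priors beta_j ~ N(0, prior_var[j]).


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def posterior_mean(X, y, prior_var, sigma2=1.0):
    """Jointly estimate all effects: posterior mean under per-predictor normal priors."""
    n, p = len(X), len(X[0])
    # X'X / sigma2, then add the prior precisions 1 / prior_var[j] on the diagonal.
    a = [[sum(X[i][j] * X[i][k] for i in range(n)) / sigma2 for k in range(p)]
         for j in range(p)]
    for j in range(p):
        a[j][j] += 1.0 / prior_var[j]
    # X'y / sigma2
    b = [sum(X[i][j] * y[i] for i in range(n)) / sigma2 for j in range(p)]
    return solve(a, b)


# Toy design: two predictors whose least-squares estimates are both exactly 1.0.
X = [[1, 0], [0, 1], [1, 0], [0, 1]]
y = [1, 1, 1, 1]

# Hypothetical prior variances: predictor 0 has little biological support
# (small variance, strong shrinkage); predictor 1 has strong support.
prior_var = [0.1, 10.0]

beta = posterior_mean(X, y, prior_var)
print(beta)  # predictor 0 is shrunk to about 0.17, predictor 1 only to about 0.95
```

Both predictors carry the same evidence in the data, so the difference in the estimates comes entirely from the prior: the larger prior variance leaves the second effect close to its least-squares value. The binary outcome mentioned in the abstract has no such closed form and is not covered by this sketch.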
