Hu Yiming, Lu Qiongshi, Powles Ryan, Yao Xinwei, Yang Can, Fang Fang, Xu Xinran, Zhao Hongyu
Department of Biostatistics, Yale School of Public Health, New Haven, CT, United States of America.
Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States of America.
PLoS Comput Biol. 2017 Jun 8;13(6):e1005589. doi: 10.1371/journal.pcbi.1005589. eCollection 2017 Jun.
Genetic risk prediction is an important goal in human genetics research and precision medicine. Accurate prediction models will have great impact on both disease prevention and early treatment strategies. Despite the identification of thousands of disease-associated genetic variants through genome-wide association studies (GWAS), genetic risk prediction accuracy remains moderate for most diseases, largely because it is difficult both to identify all functionally relevant variants and to estimate their effect sizes accurately in the presence of linkage disequilibrium. In this paper, we introduce AnnoPred, a principled framework that leverages diverse types of genomic and epigenomic functional annotations in genetic risk prediction for complex diseases. AnnoPred is trained using GWAS summary statistics in a Bayesian framework in which we explicitly model various functional annotations and allow for linkage disequilibrium estimated from reference genotype data. Compared with state-of-the-art risk prediction methods, AnnoPred achieves consistently improved prediction accuracy in both extensive simulations and real data.
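To make the idea of annotation-informed shrinkage under linkage disequilibrium concrete, the following is a minimal sketch and not AnnoPred's actual implementation. It assumes standardized genotypes, a zero-mean normal prior on each SNP effect whose variance has already been derived from functional annotations, and a reference LD matrix; under those assumptions the posterior mean of the joint effects is (D + diag(1/(N·σ²)))⁻¹·β̂. The function name `posterior_mean_effects` and all parameter names are hypothetical.

```python
import numpy as np

def posterior_mean_effects(beta_hat, ld, prior_var, n):
    """Posterior mean of joint SNP effects under a simplified normal-prior model.

    Assumptions (illustrative, not taken from the paper):
      beta_hat  : marginal GWAS effect estimates on standardized genotypes
      ld        : LD (correlation) matrix estimated from reference genotypes
      prior_var : per-SNP prior effect variance, e.g. informed by annotations
      n         : GWAS sample size
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    ld = np.asarray(ld, dtype=float)
    prior_var = np.asarray(prior_var, dtype=float)
    # Smaller prior variance means a larger penalty, hence stronger shrinkage.
    penalty = np.diag(1.0 / (n * prior_var))
    # Solve (D + penalty) x = beta_hat instead of forming an explicit inverse.
    return np.linalg.solve(ld + penalty, beta_hat)

# Two uncorrelated SNPs with the same marginal estimate; the first SNP has a
# larger prior variance (e.g. it lies in an enriched annotation).
effects = posterior_mean_effects(
    beta_hat=[0.05, 0.05],
    ld=np.eye(2),
    prior_var=[1e-3, 1e-4],
    n=1000,
)
print(effects)
```

In this toy input the SNP with the larger prior variance is shrunk less than the other, which is the qualitative behavior one would expect when annotations raise the prior weight of functionally relevant variants. A polygenic risk score would then be the genotype-weighted sum of such posterior effects.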