Department of Biostatistics, Epidemiology & Informatics, University of Pennsylvania, Philadelphia, PA, USA.
Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA.
Nat Rev Genet. 2020 Aug;21(8):493-502. doi: 10.1038/s41576-020-0224-1. Epub 2020 Mar 31.
Accurate prediction of disease risk based on the genetic make-up of an individual is essential for effective prevention and personalized treatment. Nevertheless, to date, individual genetic variants from genome-wide association studies have achieved only moderate prediction of disease risk. The aggregation of genetic variants under a polygenic model shows promising improvements in prediction accuracy. Increasingly, electronic health records (EHRs) are being linked to patient genetic data in biobanks, which provides new opportunities for developing and applying polygenic risk scores in the clinic to systematically examine and evaluate patient susceptibilities to disease. However, the heterogeneous nature of EHR data poses many practical challenges at every step of designing and implementing risk prediction strategies. In this Review, we present the unique considerations for using genotype and phenotype data from biobank-linked EHRs for polygenic risk prediction.