Guangzhou HKUST Fok Ying Tung Research Institute, Guangzhou 511458, China; Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong SAR, China.
Shenzhen Research Institute of Big Data, Shenzhen 518172, China.
Am J Hum Genet. 2021 Apr 1;108(4):632-655. doi: 10.1016/j.ajhg.2021.03.002. Epub 2021 Mar 25.
The development of polygenic risk scores (PRSs) has proved useful for stratifying the general European population into different risk groups. However, PRSs are less accurate in non-European populations because of genetic differences across populations. To improve prediction accuracy in non-European populations, we propose a cross-population analysis framework for PRS construction with both individual-level (XPA) and summary-level (XPASS) GWAS data. By leveraging trans-ancestry genetic correlation, our methods can borrow information from biobank-scale European population data to improve risk prediction in non-European populations. Our framework can also incorporate population-specific effects to further improve PRS construction. With innovations in data structure and algorithm design, our methods provide a substantial saving in computational time and memory usage. Through comprehensive simulation studies, we show that our framework provides accurate, efficient, and robust PRS construction across a range of genetic architectures. In a Chinese cohort, our methods achieved 7.3%-198.0% accuracy gain for height and 19.5%-313.3% accuracy gain for body mass index (BMI) in terms of predictive R² compared to existing PRS approaches. We also show that XPA and XPASS can achieve substantial improvement in the construction of height PRSs in the African population, suggesting the generality of our framework across global populations.
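The accuracy gains above are stated relative to existing PRS approaches. As a rough illustration of how such a comparison can be computed, the sketch below scores individuals as a weighted sum of allele counts, measures accuracy as the squared Pearson correlation between score and phenotype, and expresses the difference as a percentage of the baseline. It is not the XPA or XPASS algorithm: the function names, the toy data, and the choice of squared correlation as the accuracy measure are all assumptions made for illustration.

```python
import numpy as np


def polygenic_score(genotypes, weights):
    """Weighted sum of allele counts: one score per individual (hypothetical helper)."""
    return np.asarray(genotypes, dtype=float) @ np.asarray(weights, dtype=float)


def predictive_r2(score, phenotype):
    """Squared Pearson correlation between a score and the observed phenotype.

    Assumption: accuracy is measured this way; the abstract does not define it.
    """
    r = np.corrcoef(score, phenotype)[0, 1]
    return float(r ** 2)


def relative_gain_percent(r2_new, r2_baseline):
    """Accuracy gain of a new score over a baseline, as a percentage of the baseline."""
    return 100.0 * (r2_new - r2_baseline) / r2_baseline


# Toy data only: simulated allele counts (0, 1, 2) and a simulated phenotype.
rng = np.random.default_rng(0)
n_individuals, n_variants = 500, 50
genotypes = rng.integers(0, 3, size=(n_individuals, n_variants))
true_effects = rng.normal(0.0, 0.1, size=n_variants)
phenotype = genotypes @ true_effects + rng.normal(0.0, 1.0, size=n_individuals)

# Two hypothetical sets of effect-size estimates with different amounts of noise.
baseline_weights = true_effects + rng.normal(0.0, 0.2, size=n_variants)
improved_weights = true_effects + rng.normal(0.0, 0.05, size=n_variants)

r2_baseline = predictive_r2(polygenic_score(genotypes, baseline_weights), phenotype)
r2_improved = predictive_r2(polygenic_score(genotypes, improved_weights), phenotype)
gain = relative_gain_percent(r2_improved, r2_baseline)
print(f"baseline R2={r2_baseline:.3f}, improved R2={r2_improved:.3f}, gain={gain:.1f}%")
```

In practice the weights would come from a fitted PRS method and the accuracy would be evaluated on held-out individuals from the target population rather than on simulated data.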