Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA; Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Am J Hum Genet. 2023 Nov 2;110(11):1888-1902. doi: 10.1016/j.ajhg.2023.09.013. Epub 2023 Oct 27.
Admixed individuals offer unique opportunities for addressing limited transferability in polygenic scores (PGSs), given the substantial trans-ancestry genetic correlation in many complex traits. However, they are rarely considered in PGS training, given the challenges in representing ancestry-matched linkage-disequilibrium reference panels for admixed individuals. Here we present inclusive PGS (iPGS), which captures ancestry-shared genetic effects by finding the exact solution for penalized regression on individual-level data and is thus naturally applicable to admixed individuals. We validate our approach in a simulation study across 33 configurations with varying heritability, polygenicity, and ancestry composition in the training set. When iPGS is applied to n = 237,055 ancestry-diverse individuals in the UK Biobank, it shows the greatest improvements in Africans by 48.9% on average across 60 quantitative traits and up to 50-fold improvements for some traits (neutrophil count, R = 0.058) over the baseline model trained on the same number of European individuals. When we allowed iPGS to use n = 284,661 individuals, we observed an average improvement of 60.8% for African, 11.6% for South Asian, 7.3% for non-British White, 4.8% for White British, and 17.8% for the other individuals. We further developed iPGS+refit to jointly model the ancestry-shared and -dependent genetic effects when heterogeneous genetic associations were present. For neutrophil count, for example, iPGS+refit showed the highest predictive performance in the African group (R = 0.115), which exceeds the best predictive performance for the White British group (R = 0.090 in the iPGS model), even though only 1.49% of individuals used in the iPGS training are of African ancestry. Our results indicate the power of including diverse individuals for developing more equitable PGS models.
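The core idea in the abstract, fitting one penalized regression to pooled individual-level genotypes so that no ancestry-matched linkage-disequilibrium reference panel is needed, can be illustrated with a toy sketch. The abstract does not specify the penalty, solver, or software used for iPGS, so everything below is an assumption made for illustration: a plain lasso solved by coordinate descent, simulated genotypes for two hypothetical groups (`group_A`, `group_B`) with different allele frequencies but shared effect sizes, and arbitrary sample sizes. It is not the authors' implementation and does not reproduce any result from the paper.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator arising from the L1 (lasso) penalty."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def center(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1.

    X is a list of rows (individuals x variants). Columns of X and y are
    assumed to be centered, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # equals y - X @ beta because beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a monomorphic variant carries no signal
            # Covariance of variant j with the partial residual
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_b = soft_threshold(rho, lam) / col_sq[j]
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new_b
    return beta


def pearson_r(a, b):
    ca, cb = center(a), center(b)
    num = sum(x * y for x, y in zip(ca, cb))
    den = (sum(x * x for x in ca) * sum(y * y for y in cb)) ** 0.5
    return num / den if den > 0 else 0.0


def simulate_group(n, freqs, effects, noise_sd, rng):
    """Draw 0/1/2 allele counts per variant and an additive phenotype."""
    rows, pheno = [], []
    for _ in range(n):
        g = [sum(rng.random() < f for _ in range(2)) for f in freqs]
        rows.append(g)
        pheno.append(sum(b * x for b, x in zip(effects, g)) + rng.gauss(0, noise_sd))
    return rows, pheno


def demo(seed=0):
    """Train on pooled groups, then report held-out Pearson R per group."""
    rng = random.Random(seed)
    p = 20
    effects = [0.5 if j < 5 else 0.0 for j in range(p)]  # shared across groups
    freqs = {  # hypothetical group-specific allele frequencies
        "group_A": [rng.uniform(0.1, 0.5) for _ in range(p)],
        "group_B": [rng.uniform(0.1, 0.5) for _ in range(p)],
    }
    X_train, y_train = [], []
    for name, n in (("group_A", 300), ("group_B", 60)):  # unbalanced pool
        rows, pheno = simulate_group(n, freqs[name], effects, 1.0, rng)
        X_train += rows
        y_train += pheno
    col_means = [sum(r[j] for r in X_train) / len(X_train) for j in range(p)]
    Xc = [[r[j] - col_means[j] for j in range(p)] for r in X_train]
    beta = lasso_cd(Xc, center(y_train), lam=0.05, n_iter=50)
    results = {}
    for name in freqs:
        rows, pheno = simulate_group(200, freqs[name], effects, 1.0, rng)
        scores = [sum(b * x for b, x in zip(beta, r)) for r in rows]
        results[name] = pearson_r(scores, pheno)  # R is invariant to centering
    return results


if __name__ == "__main__":
    for group, r in demo().items():
        print(f"{group}: held-out R = {r:.3f}")
```

Because the model is fitted directly to genotype rows, an admixed individual is simply one more row in the pooled matrix; no per-ancestry reference panel enters the computation. The simulation covers only the ancestry-shared case and does not model the ancestry-dependent effects that iPGS+refit addresses.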