Department of Cognitive Science, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA.
Center for Human Development, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA, 92161, USA.
Behav Genet. 2023 May;53(3):292-309. doi: 10.1007/s10519-023-10139-w. Epub 2023 Apr 5.
Using individuals' genetic data, researchers can generate Polygenic Scores (PS) that predict disease risk, variability in behaviors, and anthropometric measures. This is achieved by leveraging models learned from previously published large Genome-Wide Association Studies (GWASs), which associate locations in the genome with a phenotype of interest. Previous GWASs have predominantly been performed in individuals of European ancestry. This is of concern because PS generated in samples whose ancestry differs from that of the original training GWAS have been shown to have lower performance and limited portability, and many efforts are now underway to collect genetic databases on individuals of diverse ancestries. In this study, we compare multiple methods of generating PS, including pruning and thresholding and Bayesian continuous shrinkage models, to determine which of them is best able to overcome these limitations. To do this we use the Adolescent Brain Cognitive Development (ABCD) Study, a longitudinal cohort with deep phenotyping of individuals of diverse ancestry. We generate PS for anthropometric and psychiatric phenotypes using previously published GWAS summary statistics and examine their performance in three subsamples of ABCD: African ancestry individuals (n = 811), European ancestry individuals (n = 6703), and admixed ancestry individuals (n = 3664). We find that the single-ancestry continuous shrinkage method, PRScs (CS), and the multi-ancestry meta method, PRScsx Meta (CSx Meta), show the best performance across ancestries and phenotypes.
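As a rough illustration of what "generating a PS" involves, a score is commonly computed as a weighted sum of effect-allele dosages, with the weights taken from GWAS summary statistics. The sketch below is a toy under stated assumptions: the function names and all numbers are hypothetical, it shows only the thresholding half of pruning and thresholding (linkage-disequilibrium pruning is omitted), and it does not reproduce PRScs or PRScsx, which instead re-estimate the weights under a Bayesian continuous shrinkage prior.

```python
# Toy sketch only: function names and data are hypothetical, not from the study.

def threshold_weights(betas, p_values, p_max):
    """Keep GWAS effect sizes whose p-value is at or below p_max; zero the rest.

    This is only the thresholding step; LD pruning is not implemented here.
    """
    return [b if p <= p_max else 0.0 for b, p in zip(betas, p_values)]


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1, or 2 per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def r_squared(scores, phenotype):
    """Squared Pearson correlation, one simple measure of PS performance."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_y = sum(phenotype) / n
    s_sy = sum((s - mean_s) * (y - mean_y) for s, y in zip(scores, phenotype))
    s_ss = sum((s - mean_s) ** 2 for s in scores)
    s_yy = sum((y - mean_y) ** 2 for y in phenotype)
    return (s_sy * s_sy) / (s_ss * s_yy)


if __name__ == "__main__":
    betas = [0.5, -0.2, 0.1]        # hypothetical GWAS effect sizes
    p_values = [1e-9, 0.2, 0.01]    # hypothetical GWAS p-values
    weights = threshold_weights(betas, p_values, p_max=0.05)

    # Rows are individuals, columns are variants (hypothetical genotypes).
    genotypes = [[0, 1, 2], [2, 0, 1], [1, 2, 0], [2, 2, 2]]
    scores = [polygenic_score(g, weights) for g in genotypes]
    phenotype = [0.3, 1.2, 0.4, 1.1]  # hypothetical measured trait
    print(scores, r_squared(scores, phenotype))
```

Comparing performance across ancestry groups, as the study does, would amount to evaluating a measure such as `r_squared` separately within each subsample; that step is left out here because it depends on data not given in the abstract.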