Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.
Division of Epidemiology and Biostatistics, School of Public Health, University of California, Berkeley, Berkeley, CA, USA.
HGG Adv. 2023 Oct 12;4(4):100239. doi: 10.1016/j.xhgg.2023.100239. Epub 2023 Sep 14.
The utility of polygenic risk score (PRS) models has not been comprehensively evaluated for childhood acute lymphoblastic leukemia (ALL), the most common type of cancer in children. Previous PRS models for ALL were based on significant loci observed in genome-wide association studies (GWASs), even though genomic PRS models have been shown to improve prediction performance for a number of complex diseases. In the United States, Latino (LAT) children have the highest risk of ALL, but the transferability of PRS models to LAT children has not been studied. In this study, we constructed and evaluated genomic PRS models based on either a non-Latino White (NLW) GWAS or a multi-ancestry GWAS. We found that the best PRS models performed similarly in held-out NLW and LAT samples (PseudoR² = 0.086 ± 0.023 in NLW vs. 0.060 ± 0.020 in LAT), and that performance in LAT could be improved by performing the GWAS in LAT-only (PseudoR² = 0.116 ± 0.026) or multi-ancestry samples (PseudoR² = 0.131 ± 0.025). However, the best genomic models currently do not have better prediction accuracy than a conventional model using all known ALL-associated loci in the literature (PseudoR² = 0.166 ± 0.025), which includes loci from GWAS populations that we could not access to train genomic PRS models. Our results suggest that larger and more inclusive GWASs may be needed for genomic PRS to be useful for ALL. Moreover, the comparable performance between populations may suggest a more oligogenic architecture for ALL, in which some large-effect loci are shared between populations. Future PRS models that move away from the assumption of infinitely many causal loci may further improve PRS for ALL.