Department of Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China.
Center for Medical Statistics and Data Analysis, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China.
Brief Bioinform. 2023 Jul 20;24(4). doi: 10.1093/bib/bbad232.
Trans-ethnic genome-wide association studies have revealed that many loci identified in European populations are reproducible in non-European populations, indicating widespread trans-ethnic genetic similarity. However, how to leverage this shared information more efficiently in association analysis of traits in underrepresented populations has received little attention. Here we propose a statistical framework, the trans-ethnic genetic risk score informed gene-based association mixed model (GAMM), which hierarchically models single-nucleotide polymorphism effects in the target population as a function of the effects on the same trait in well-studied populations. GAMM integrates genetic similarity across distinct ancestral groups to enhance power in understudied populations, as confirmed by extensive simulations. We illustrate the usefulness of GAMM by applying it to 13 blood cell traits (i.e. basophil count, eosinophil count, hematocrit, hemoglobin concentration, lymphocyte count, mean corpuscular hemoglobin, mean corpuscular hemoglobin concentration, mean corpuscular volume, monocyte count, neutrophil count, platelet count, red blood cell count and total white blood cell count) in Africans of the UK Biobank (n = 3204) while utilizing the genetic overlap shared with Europeans (n = 746 667) and East Asians (n = 162 255). We discovered multiple new associated genes that would otherwise have been missed by existing methods, and found that the trans-ethnic information indirectly contributed substantially to the phenotypic variance. Overall, GAMM represents a flexible and powerful statistical framework for association analysis of complex traits in underrepresented populations by integrating trans-ethnic genetic similarity with well-studied populations, and helps attenuate the health inequities that current genetics research creates for people of minority populations.
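The abstract does not state the model equations, so the following is only a minimal sketch of one plausible reading of the hierarchical idea, not the authors' implementation. It assumes that, within a gene, each target-population SNP effect is a scaled copy of the corresponding reference-population effect plus a SNP-specific deviation (beta = alpha * b_ref + eta). Under that assumption the gene's contribution to the phenotype splits into a trans-ethnic genetic risk score term and a residual random-effect term. All names (simulate_gene, b_ref, alpha, eta) and all parameter values are hypothetical and simulated.

```python
import numpy as np


def simulate_gene(n=500, m=20, alpha=0.5, sigma_eta=0.05, seed=0):
    """Simulate one gene under an assumed hierarchical effect model.

    Assumption (not taken from the abstract): beta_j = alpha * b_ref_j + eta_j,
    where b_ref_j is the SNP effect estimated in a well-studied population.
    """
    rng = np.random.default_rng(seed)

    # Genotypes (0/1/2 allele counts) for n target-population individuals, m SNPs.
    maf = rng.uniform(0.05, 0.5, size=m)
    G = rng.binomial(2, maf, size=(n, m)).astype(float)
    G -= G.mean(axis=0)  # centre each SNP column

    # Hypothetical reference-population effects for the same SNPs and trait.
    b_ref = rng.normal(0.0, 0.1, size=m)

    # SNP-specific deviations of the target population from the reference.
    eta = rng.normal(0.0, sigma_eta, size=m)

    # Hierarchical effect model: shared (scaled) part plus deviation.
    beta = alpha * b_ref + eta

    # Trans-ethnic genetic risk score: genotypes weighted by reference effects.
    grs = G @ b_ref

    e = rng.normal(0.0, 1.0, size=n)  # residual noise
    y = G @ beta + e                  # phenotype in the target population
    return G, b_ref, eta, beta, grs, y, e


G, b_ref, eta, beta, grs, y, e = simulate_gene(alpha=0.5)

# The phenotype decomposes into a fixed effect of the risk score (alpha)
# and a random-effect term driven by the deviations eta.
y_decomposed = 0.5 * grs + G @ eta + e
print(np.allclose(y, y_decomposed))
```

Under this reading, a gene with no association corresponds to alpha = 0 together with zero variance of eta, so a gene-level test would examine both components jointly; how GAMM actually constructs and combines these tests is not specified in the abstract.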