Zhou Kangping, Gheybi Kazzem, Soh Pamela X Y, Hayes Vanessa M
Ancestry and Health Genomics Laboratory, Charles Perkins Centre, School of Medical Sciences, Faculty of Medicine and Health, University of Sydney, Camperdown, Sydney, NSW, Australia.
Manchester Cancer Research Centre, University of Manchester, Manchester, UK.
Commun Med (Lond). 2025 May 6;5(1):157. doi: 10.1038/s43856-025-00883-x.
Genetic germline testing is restricted for African patients. Lack of ancestrally relevant genomic data perpetuated by African diversity has resulted in European-biased curated clinical variant databases and pathogenic prediction guidelines. While numerous variant pathogenicity prediction tools (VPPTs) exist, their performance has yet to be established within the context of African diversity.
To address this limitation, we assessed 54 VPPTs for predictive performance (sensitivity, specificity, false positive and negative rates) across 145,291 known pathogenic or benign variants derived from 50 Southern African and 50 European men matched for advanced prostate cancer. Prioritising VPPTs for optimal ancestral performance, we screened 5.3 million variants of unknown significance for predicted functional and oncogenic potential.
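The four performance measures named above can be illustrated with a minimal sketch. It assumes each tool's output has already been reduced to a binary call per variant (pathogenic or benign) and that a curated truth label is available. The names `ToolMetrics` and `score_tool` are hypothetical and are not taken from the study's pipeline.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ToolMetrics:
    sensitivity: float          # TP / (TP + FN)
    specificity: float          # TN / (TN + FP)
    false_positive_rate: float  # FP / (FP + TN)
    false_negative_rate: float  # FN / (FN + TP)


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a class is absent from the truth set.
    return numerator / denominator if denominator else float("nan")


def score_tool(truth: Sequence[bool], predicted: Sequence[bool]) -> ToolMetrics:
    """Score one prediction tool; True means 'pathogenic', False means 'benign'.

    Hypothetical helper: assumes binary calls aligned one-to-one with truth labels.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)

    return ToolMetrics(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        false_positive_rate=_ratio(fp, fp + tn),
        false_negative_rate=_ratio(fn, fn + tp),
    )


# Toy labels for illustration only; these are not data from the study.
toy_truth = [True, True, True, False, False]
toy_calls = [True, True, False, False, True]
toy_metrics = score_tool(toy_truth, toy_calls)
```

Scoring each tool separately on the African-derived and European-derived variant sets would then allow an ancestry-stratified comparison of the kind described here.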
We observed a 2.1- and 4.1-fold increase in the number of known and predicted rare pathogenic or benign variants, respectively, against a 1.6-fold decrease in the number of available interrogated variants in our European over African data. Although sensitivity was significantly lower for our African data overall (0.66 vs 0.71, p = 9.86E-06), MetaSVM, CADD, Eigen-raw, BayesDel-noAF, phyloP100way-vertebrate and MVP were top performers irrespective of ancestry. Conversely, MutationTaster, DANN, LRT and GERP-RS were African-specific top performers, while MutationAssessor, PROVEAN, LIST-S2 and REVEL were European-specific top performers. Using these pathogenic prediction workflows, we narrowed the ancestral gap for potentially deleterious and oncogenic variant prediction in favour of our African data by 1.15- and 1.1-fold, respectively.
Although VPPT sensitivity favours European data, our findings provide guidelines for VPPT selection to maximise rare pathogenic variant prediction for African disease studies.