Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, QC, H3T 1E2, Canada.
Quantitative Life Sciences Program, McGill University, Montréal, QC, H3A 0G4, Canada.
Hum Genet. 2023 Jun;142(6):749-758. doi: 10.1007/s00439-023-02548-y. Epub 2023 Apr 2.
Genome-wide association studies (GWAS) have identified thousands of loci associated with disease, yet the causal genes within these loci remain largely unknown. Identifying these causal genes would enable a deeper understanding of disease and assist genetics-based drug development. Exome-wide association studies (ExWAS) are more expensive but can pinpoint causal genes that offer high-yield drug targets; however, they suffer from a high false-negative rate. Several algorithms have been developed to prioritize genes at GWAS loci, such as the Effector Index (Ei), Locus-2-Gene (L2G), Polygenic Prioritization score (PoPs), and Activity-by-Contact score (ABC), but it is not known whether these algorithms can predict ExWAS findings from GWAS data. If they can, thousands of associated GWAS loci could potentially be resolved to causal genes. Here, we quantified the performance of these algorithms by evaluating their ability to identify ExWAS-significant genes for nine traits. We found that Ei, L2G, and PoPs can identify ExWAS-significant genes with high areas under the precision-recall curve (Ei: 0.52, L2G: 0.37, PoPs: 0.18, ABC: 0.14). Furthermore, for every unit increase in the normalized scores, there was an associated 1.3- to 4.6-fold increase in the odds of a gene reaching exome-wide significance (Ei: 4.6, L2G: 2.5, PoPs: 2.1, ABC: 1.3). Overall, we found that Ei, L2G, and PoPs can anticipate ExWAS findings from widely available GWAS results. These techniques are therefore promising when well-powered ExWAS data are not readily available: they can be used to anticipate ExWAS findings and thereby prioritize genes at GWAS loci.
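The two metrics reported above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how the precision-recall area or the odds ratios were computed, so this assumes average precision as the area estimate and a single-predictor logistic regression for the odds ratio. The function names `average_precision` and `odds_ratio_per_unit` are hypothetical, and the toy labels and scores are invented for illustration only, not taken from the study.

```python
import math

def average_precision(labels, scores):
    """Estimate the area under the precision-recall curve as average precision.

    labels: 1 if a gene is ExWAS-significant, else 0 (assumed encoding).
    scores: prioritization score per gene, higher meaning more likely causal.
    Tied scores are not handled specially in this sketch.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive gene")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / n_pos

def odds_ratio_per_unit(labels, scores, iterations=25):
    """Odds ratio per unit increase in score, from a logistic regression
    with an intercept and one predictor, fitted by Newton-Raphson.
    Assumes the data are not perfectly separable by score."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for y, x in zip(labels, scores):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient w.r.t. intercept
            g1 += (y - p) * x      # gradient w.r.t. slope
            h00 += w               # entries of X^T W X
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return math.exp(b1)

# Toy data, invented for illustration (not from the study).
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]

auprc = average_precision(toy_labels, toy_scores)
odds_ratio = odds_ratio_per_unit(toy_labels, toy_scores)
print(f"AUPRC (average precision): {auprc:.3f}")
print(f"Odds ratio per unit score: {odds_ratio:.2f}")
```

Because the precision-recall baseline equals the fraction of positive genes, an area such as 0.52 should be read against how rare ExWAS-significant genes are among the candidates, not against 0.5. The odds ratio likewise depends on how the scores were normalized, which the abstract does not specify.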