MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK.
NHS Lothian, Edinburgh, UK.
Genome Biol. 2024 Aug 6;25(1):208. doi: 10.1186/s13059-024-03352-1.
Genome-wide association studies (GWAS) have revealed a multitude of candidate genetic variants affecting the risk of developing complex traits and diseases. However, the highlighted regions are typically in the non-coding genome, and uncovering the functional causative single nucleotide variants (SNVs) is challenging. Prioritization of variants is commonly based on genomic annotation with markers of active regulatory elements, but current approaches still poorly predict functional variants. To address this, we systematically analyze six markers of active regulatory elements for their ability to identify functional variants.
We benchmark against molecular quantitative trait loci (molQTL) from assays of regulatory element activity that identify allelic effects on DNA-binding factor occupancy, reporter assay expression, and chromatin accessibility. We identify the combination of DNase footprints and divergent enhancer RNA (eRNA) as markers for functional variants. This signature provides high precision at the cost of low recall, and therefore substantially reduces candidate variant sets when prioritizing variants for functional validation. We present this as a framework called FINDER (Functional SNV IdeNtification using DNase footprints and eRNA).
We demonstrate the framework's utility for prioritizing variants using a leukocyte count trait, and we analyze variants in linkage disequilibrium with a lead variant to predict a functional variant in asthma. Our findings have implications for prioritizing variants from GWAS, for the development of predictive scoring algorithms, and for functionally informed fine-mapping approaches.
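The precision and recall trade-off described above can be illustrated with a minimal sketch. It assumes that variants are represented by identifiers, that each marker is a set of variants overlapping that annotation, and that molQTL variants serve as the benchmark set of functional variants. The function name `precision_recall`, the variant identifiers, and the toy sets are hypothetical and are not taken from the study; the sketch only shows why intersecting two markers can raise precision while lowering recall.

```python
def precision_recall(predicted, functional):
    """Return (precision, recall) of a predicted variant set against a benchmark set."""
    predicted, functional = set(predicted), set(functional)
    true_positives = len(predicted & functional)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(functional) if functional else 0.0
    return precision, recall


# Hypothetical toy data (not from the study): variants overlapping each marker.
dnase_footprint = {"v1", "v2", "v3", "v4", "v5"}
divergent_erna = {"v2", "v3", "v6", "v7"}

# Hypothetical benchmark: variants with allelic effects in a molQTL assay.
molqtl = {"v2", "v3", "v4", "v8"}

# One marker alone versus the combined signature (variants carrying both markers).
single = precision_recall(dnase_footprint, molqtl)
combined = precision_recall(dnase_footprint & divergent_erna, molqtl)

print("DNase footprint only:", single)
print("Footprint and eRNA:  ", combined)
```

In this toy case the combined signature keeps fewer candidates, all of which are in the benchmark set, but it misses benchmark variants that carry only one marker. That is the shape of the trade-off the abstract reports, though the actual values depend on the real annotations and assays.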