Duan Jingqi, Gasch Audrey P, Keleş Sündüz
Department of Statistics, University of Wisconsin, Madison, WI, 53706, United States.
Laboratory of Genetics, University of Wisconsin, Madison, WI, 53706, United States.
Bioinformatics. 2024 Mar 29;40(4). doi: 10.1093/bioinformatics/btae181.
The ENCODE project generated a large collection of eCLIP-seq RNA binding protein (RBP) profiling data with accompanying RNA-seq transcriptomes of shRNA knockdown of RBPs. These data could help in understanding the functional impact of genetic variants; however, their potential has not been fully exploited. We implement INCA (Integrative annotation scores of variants for impact on RBP activities) as a multi-step genetic variant scoring approach that leverages the ENCODE RBP data together with ClinVar and integrates multiple computational approaches to aggregate evidence.
INCA evaluates variant impacts on RBP activities by leveraging genotypic differences in cell lines used for eCLIP-seq. We show that INCA provides critical specificity, beyond generic scoring for RBP binding disruption, for candidate variants and their linkage-disequilibrium partners. As a result, it can, on average, augment scoring of 46.2% of the candidate variants beyond generic scoring for RBP binding disruption and aid in variant prioritization for follow-up analysis.
INCA is implemented in R and is available at https://github.com/keleslab/INCA.