Cohen Shai, Bergman Shaked, Lynn Nicolas, Tuller Tamir
Department of Biomedical Engineering, Tel Aviv University, Tel-Aviv, 6997801, Israel.
Sagol School of Neuroscience, Tel Aviv University, Tel-Aviv, 6997801, Israel.
Genome Med. 2024 Dec 23;16(1):152. doi: 10.1186/s13073-024-01420-6.
CRISPR is widely used to silence genes by inducing mutations expected to nullify their expression. While numerous computational tools have been developed to design single-guide RNAs (sgRNAs) with high cutting efficiency and minimal off-target effects, only a few tools focus specifically on predicting gene knockouts following CRISPR. These tools consider factors such as conservation, amino acid composition, and frameshift likelihood. However, they neglect the impact of CRISPR on gene expression, which can dramatically affect the success of CRISPR-induced gene silencing attempts. Furthermore, information about gene expression can be useful even when the objective is not to silence a gene. A tool that considers gene expression when predicting CRISPR outcomes is therefore still lacking.
We developed EXPosition, the first computational tool that combines models predicting gene knockouts after CRISPR with models that forecast gene expression, offering more accurate predictions of gene knockout outcomes. EXPosition leverages deep-learning models to predict key steps in gene expression: transcription, splicing, and translation initiation. We showed that our tool predicts gene knockout better than existing tools across 6 datasets, 4 cell types, and ~207k sgRNAs. We also validated our gene expression models on the ClinVar dataset by showing that pathogenic mutations are enriched among the mutations our models score highly.
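The enrichment check mentioned above can be illustrated with a minimal sketch. This is not the EXPosition code or the authors' exact statistical procedure; it only assumes that each mutation has a model score and a binary pathogenic label, and it tests whether pathogenic mutations are over-represented in the top-scoring fraction using a one-sided hypergeometric test. The function name `enrichment`, the `top_fraction` parameter, and the toy data are all hypothetical.

```python
from math import comb


def enrichment(scores, is_pathogenic, top_fraction=0.1):
    """Return (fold enrichment, one-sided hypergeometric p-value) for
    pathogenic mutations among the top-scoring fraction of mutations.

    Hypothetical helper for illustration; not part of EXPosition.
    """
    n_total = len(scores)
    n_top = max(1, int(round(n_total * top_fraction)))

    # Rank mutations from highest to lowest score and keep the top slice.
    order = sorted(range(n_total), key=lambda i: scores[i], reverse=True)
    top = order[:n_top]

    k_total = sum(is_pathogenic)                  # pathogenic overall
    k_top = sum(is_pathogenic[i] for i in top)    # pathogenic in top slice

    # Pathogenic count expected in the top slice if scores were uninformative.
    expected = n_top * k_total / n_total
    fold = k_top / expected if expected > 0 else float("nan")

    # P(X >= k_top) when n_top mutations are drawn without replacement.
    upper = min(k_total, n_top)
    p_value = sum(
        comb(k_total, k) * comb(n_total - k_total, n_top - k)
        for k in range(k_top, upper + 1)
    ) / comb(n_total, n_top)
    return fold, p_value


# Toy data (made up): the two highest-scoring mutations are the pathogenic ones.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
fold, p_value = enrichment(toy_scores, toy_labels, top_fraction=0.2)
print(f"fold enrichment = {fold:.2f}, p = {p_value:.4f}")
```

A fold enrichment above 1 with a small p-value would indicate that high scores concentrate pathogenic mutations; the threshold for "high-scoring" (here the top 20% of a toy set) is an arbitrary choice for the sketch.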
We believe EXPosition will enhance both the efficiency and accuracy of genome editing projects by directly predicting CRISPR's effect on various aspects of gene expression. EXPosition is available at http://www.cs.tau.ac.il/~tamirtul/EXPosition . The source code is available at https://github.com/shaicoh3n/EXPosition .