Alipanahi Roghayyeh, Safari Leila, Khanteymoori Alireza
Department of Computer Engineering, University of Zanjan, Zanjan, Iran.
Department of Neurozentrum, Universitätsklinikum Freiburg, Freiburg, Germany.
Front Bioinform. 2023 Jan 11;2:1001131. doi: 10.3389/fbinf.2022.1001131. eCollection 2022.
Clustered regularly interspaced short palindromic repeats (CRISPR)-based gene editing has been widely used in various cell types and organisms. To make CRISPR genome editing more precise and practical, attention must focus on designing optimal guide RNA (gRNA) and selecting appropriate Cas enzymes. Numerous computational tools have been created in recent years to help researchers design the best gRNA for CRISPR research. There are two approaches to designing a gRNA sequence that targets the desired sites with high precision: experimental and prediction-based. Reducing off-target sites is essential when designing an optimal gRNA. Here we review both traditional and machine learning-based approaches for designing gRNA sequences and predicting off-target sites. We summarize the key characteristics of the available tools, as comprehensively as possible, and compare them. Machine learning-based tools and web servers are expected to become the most effective and reliable methods for predicting CRISPR on-target and off-target activity. However, these predictions are not yet precise, and the performance of these algorithms, especially deep learning ones, depends on the amount of data used during training. As more features are discovered and incorporated into these models, predictions should come closer to experimental observations. Accurate and feasible CRISPR genome editing therefore depends on designing ideal gRNA and choosing suitable Cas enzymes.