Xie Huanzeng, Huang Lingze, Luo Ye, Zhang Guishan
College of Engineering, Shantou University, Shantou 515063, Guangdong, China.
Sheng Wu Gong Cheng Xue Bao. 2024 Mar 25;40(3):858-876. doi: 10.13345/j.cjb.230382.
Clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9 (CRISPR/Cas9) is a new-generation gene editing technology that relies on a single guide RNA (sgRNA) to recognize a specific genomic site and guide the Cas9 nuclease to edit it. However, off-target effects hamper the development of this technology. In recent years, several deep learning models have been developed to predict CRISPR/Cas9 off-target activity, contributing to more efficient and safer gene editing and gene therapy, but their prediction accuracy remains to be improved. In this paper, we propose CnnCRISPR, a method based on a multi-scale convolutional neural network, for CRISPR/Cas9 off-target prediction. First, the sgRNA-DNA sequence pair is one-hot encoded, and a bitwise OR operation is applied to the two resulting binary matrices. Second, the encoded sequence is fed into an Inception-based network for training and evaluation. Third, the trained model is used to predict off-target activity for sgRNA-DNA sequence pairs. Experiments on public datasets showed that CnnCRISPR outperforms existing deep learning-based methods, providing an effective and feasible approach to the off-target problem.
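The encoding step described in the abstract (one-hot encoding of each sequence, then a bitwise OR of the two binary matrices) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `one_hot` and `encode_pair`, the A/C/G/T channel order, the all-zero handling of unknown bases, and the requirement that both sequences are written in the DNA alphabet with equal length are all assumptions made for illustration.

```python
# Minimal sketch of the one-hot + bitwise OR pair encoding (assumed details).
BASES = "ACGT"  # assumed channel order; the abstract does not specify one


def one_hot(seq):
    """Return a len(seq) x 4 binary matrix as nested lists.

    Bases outside A/C/G/T (for example N) become an all-zero row,
    which is an assumption of this sketch.
    """
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def encode_pair(sgrna, dna):
    """Combine two equal-length sequences by element-wise bitwise OR."""
    if len(sgrna) != len(dna):
        raise ValueError("sgRNA and DNA sequences must have the same length")
    return [
        [a | b for a, b in zip(row_s, row_d)]
        for row_s, row_d in zip(one_hot(sgrna), one_hot(dna))
    ]


# Toy 4-nt example; real sgRNA-DNA pairs are longer.
encoded = encode_pair("GACT", "GATT")
for row in encoded:
    print(row)
```

In this toy example, matched positions keep a single 1 per row, while the mismatched third position (C versus T) yields a row with two 1s. Because OR is symmetric, the combined row alone does not record which sequence contributed which base.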