Zhu Wentao, Xie Huanzeng, Chen Yaowen, Zhang Guishan
College of Engineering, Shantou University, Shantou 515063, China.
Int J Mol Sci. 2024 Apr 17;25(8):4429. doi: 10.3390/ijms25084429.
CRISPR/Cas9 is a powerful genome-editing tool, but its broad application is limited by an incomplete understanding of the factors that govern single-guide RNA (sgRNA) activity. Several deep-learning-based methods have been developed to predict on-target activity, yet there is still room for improvement. Here, we propose a hybrid neural network, CrnnCrispr, which integrates a convolutional neural network and a recurrent neural network for on-target activity prediction. We performed unbiased comparisons with four mainstream methods on nine public datasets of varying sample sizes. Additionally, we incorporated a transfer learning strategy to boost prediction performance on small-scale datasets. Our results show that CrnnCrispr outperforms existing methods in terms of accuracy and generalizability. Finally, we applied a visualization approach to investigate generalizable nucleotide-position-dependent patterns of sgRNAs for on-target activity; this approach shows potential for model interpretability and helps clarify the principles of sgRNA design.
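To make the idea of combining a convolutional stage with a recurrent stage concrete, the following is a minimal forward-pass sketch in plain Python. It is not the CrnnCrispr architecture: the abstract does not specify the input encoding, layer sizes, or recurrent cell. The one-hot encoding, the single convolution layer, the Elman-style recurrence, the random untrained weights, and all function names (`one_hot`, `conv1d`, `simple_rnn`, `predict`) are assumptions chosen for illustration only.

```python
import math
import random

NUCLEOTIDES = "ACGT"


def one_hot(seq):
    """Encode a DNA sequence as a list of 4-element one-hot vectors (assumed encoding)."""
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq.upper()]


def conv1d(x, kernels, bias):
    """Valid 1D convolution followed by ReLU.

    x: L x C input, kernels: F x K x C weights, bias: F values.
    Returns an (L - K + 1) x F feature map.
    """
    k = len(kernels[0])
    out = []
    for start in range(len(x) - k + 1):
        window = x[start:start + k]
        feats = []
        for f, kernel in enumerate(kernels):
            s = bias[f] + sum(
                kernel[i][c] * window[i][c]
                for i in range(k)
                for c in range(len(window[i]))
            )
            feats.append(max(0.0, s))
        out.append(feats)
    return out


def simple_rnn(xs, w_in, w_rec, bias):
    """Elman recurrence h_t = tanh(W_in x_t + W_rec h_{t-1} + b); returns the final state."""
    hidden = len(bias)
    h = [0.0] * hidden
    for x in xs:
        h = [
            math.tanh(
                bias[j]
                + sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][i] * h[i] for i in range(hidden))
            )
            for j in range(hidden)
        ]
    return h


def init(rows, cols, rng):
    """Small random weight matrix; a trained model would learn these values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def predict(seq, seed=0, filters=4, kernel_size=3, hidden=5):
    """Score one sequence with untrained weights; the output is a number in (0, 1)."""
    rng = random.Random(seed)
    kernels = [init(kernel_size, 4, rng) for _ in range(filters)]
    w_in = init(hidden, filters, rng)
    w_rec = init(hidden, hidden, rng)
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    feats = conv1d(one_hot(seq), kernels, [0.0] * filters)  # local sequence motifs
    h = simple_rnn(feats, w_in, w_rec, [0.0] * hidden)      # position-dependent context
    z = sum(w * v for w, v in zip(w_out, h))
    return 1.0 / (1.0 + math.exp(-z))                       # sigmoid output


if __name__ == "__main__":
    # Hypothetical 23-nt target (20-nt protospacer plus NGG PAM), for illustration only.
    print(predict("ACGTACGTACGTACGTACGTAGG"))
```

Because the weights are random, the printed score carries no biological meaning. The sketch only shows how convolutional features computed over sequence windows can be passed, position by position, to a recurrent layer before a final scalar output.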