Off-target predictions in CRISPR-Cas9 gene editing using deep learning.

Affiliation

Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR.

Publication information

Bioinformatics. 2018 Sep 1;34(17):i656-i663. doi: 10.1093/bioinformatics/bty554.

DOI:10.1093/bioinformatics/bty554

PMID:30423072

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6129261/

Abstract

MOTIVATION

Predicting off-target mutations in CRISPR-Cas9 is an active research topic because of its relevance to gene editing. Most existing prediction methods simply compute scores from mismatches to the guide sequence. As a result, they cannot scale or improve their performance as CRISPR-Cas9 experimental data rapidly expands. Moreover, their precision is still insufficient for off-target prediction in gene editing at the clinical level.
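
To make the idea of a mismatch-based score concrete, here is a minimal sketch in Python. It is not CFD, MIT, CROP-IT, or CCTop, which use their own scoring rules; the functions `count_mismatches` and `naive_mismatch_score` and the two sequences are hypothetical, chosen only for illustration.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Count positions where the guide and the candidate site differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(g != s for g, s in zip(guide.upper(), site.upper()))


def naive_mismatch_score(guide: str, site: str) -> float:
    """Hypothetical score in [0, 1]: 1.0 for a perfect match, lower with more mismatches."""
    if not guide:
        raise ValueError("guide must not be empty")
    return 1.0 - count_mismatches(guide, site) / len(guide)


# Illustrative 20-nt sequences (PAM omitted); not taken from any dataset.
guide = "GAGTCCGAGCAGAAGAAGAA"
candidate = "GAGTTAGAGCAGAAGAAGAA"  # differs from the guide at positions 5 and 6

print(count_mismatches(guide, candidate))
print(naive_mismatch_score(guide, candidate))
```

A score of this kind is a fixed formula, so adding more experimental data does not change it. That is the limitation the paragraph above describes.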

RESULTS

To address this, we design and implement two deep neural network algorithms for predicting off-target mutations in CRISPR-Cas9 gene editing: a deep convolutional neural network and a deep feedforward neural network. The models were trained and tested on the recently released CRISPOR off-target dataset for performance benchmarking. Another off-target dataset, identified by GUIDE-seq, was adopted for additional evaluation. We demonstrate that the convolutional neural network achieves the best performance on the CRISPOR dataset, yielding an average classification area under the ROC curve (AUC) of 97.2% under stratified 5-fold cross-validation. The deep feedforward neural network is also competitive, with an average AUC of 97.0% under the same setting. We compare the two deep neural network models with state-of-the-art off-target prediction methods (CFD, MIT, CROP-IT, and CCTop) and three traditional machine learning models (random forest, gradient boosting trees, and logistic regression) on both datasets in terms of AUC, demonstrating the competitive edge of the proposed algorithms. Additional analyses investigate the underlying reasons from different perspectives.
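
The evaluation protocol named above, average AUC under stratified 5-fold cross-validation, can be sketched in plain Python as follows. This is not the authors' code: the helper names (`roc_auc`, `stratified_folds`, `cross_validated_auc`, `fit_stub`) are hypothetical, the data is a synthetic toy set, and a trivial scoring rule stands in for the neural networks.

```python
import random


def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) identity; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=5, seed=0):
    """Split indices into k folds that each keep roughly the overall class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal each class round-robin across folds
    return folds


def cross_validated_auc(features, labels, fit, k=5, seed=0):
    """Average test-fold AUC; `fit(X, y)` must return a scoring function."""
    aucs = []
    for test_idx in stratified_folds(labels, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        score = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        aucs.append(roc_auc([labels[i] for i in test_idx],
                            [score(features[i]) for i in test_idx]))
    return sum(aucs) / len(aucs)


# Synthetic toy data (an assumption): one feature, a mismatch count,
# drawn lower on average for positive (off-target) examples.
rng = random.Random(1)
labels = [1] * 20 + [0] * 80
features = [rng.randint(0, 3) if y else rng.randint(2, 6) for y in labels]


def fit_stub(X, y):
    """Stand-in 'model': ignores the training data, scores by negative mismatch count."""
    return lambda x: -x


mean_auc = cross_validated_auc(features, labels, fit_stub)
print(round(mean_auc, 3))
```

Stratification ensures every test fold contains both classes, which AUC requires. In a real comparison, `fit_stub` would be replaced by a routine that trains each model on the training folds only.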

AVAILABILITY AND IMPLEMENTATION

The example code is available at https://github.com/MichaelLinn/off_target_prediction. The related datasets are available at https://github.com/MichaelLinn/off_target_prediction/tree/master/data.

Similar articles

Accurate deep learning off-target prediction with novel sgRNA-DNA sequence encoding in CRISPR-Cas9 gene editing.

Bioinformatics. 2021 Aug 25;37(16):2299-2307. doi: 10.1093/bioinformatics/btab112.

R-CRISPR: A Deep Learning Network to Predict Off-Target Activities with Mismatch, Insertion and Deletion in CRISPR-Cas9 System.

Genes (Basel). 2021 Nov 25;12(12):1878. doi: 10.3390/genes12121878.

Using traditional machine learning and deep learning methods for on- and off-target prediction in CRISPR/Cas9: a review.

Brief Bioinform. 2023 May 19;24(3). doi: 10.1093/bib/bbad131.

DeepCRISTL: deep transfer learning to predict CRISPR/Cas9 functional and endogenous on-target editing efficiency.

Bioinformatics. 2022 Jun 24;38(Suppl 1):i161-i168. doi: 10.1093/bioinformatics/btac218.

CROTON: an automated and variant-aware deep learning framework for predicting CRISPR/Cas9 editing outcomes.

Bioinformatics. 2021 Jul 12;37(Suppl_1):i342-i348. doi: 10.1093/bioinformatics/btab268.

Prediction of sgRNA on-target activity in bacteria by deep learning.

BMC Bioinformatics. 2019 Oct 24;20(1):517. doi: 10.1186/s12859-019-3151-4.

Deep learning improves the ability of sgRNA off-target propensity prediction.

BMC Bioinformatics. 2020 Feb 10;21(1):51. doi: 10.1186/s12859-020-3395-z.

[Prediction of CRISPR/Cas9 off-target activity using multi-scale convolutional neural network].

Sheng Wu Gong Cheng Xue Bao. 2024 Mar 25;40(3):858-876. doi: 10.13345/j.cjb.230382.

Interpretable CRISPR/Cas9 off-target activities with mismatches and indels prediction using BERT.

Comput Biol Med. 2024 Feb;169:107932. doi: 10.1016/j.compbiomed.2024.107932. Epub 2024 Jan 1.

Cited by

Tapping the microalgal potential: genetic precision and stress-induction for enhanced astaxanthin and biofuel production.

Biotechnol Biofuels Bioprod. 2025 Aug 14;18(1):92. doi: 10.1186/s13068-025-02656-z.

Off-target effects in CRISPR-Cas genome editing for human therapeutics: Progress and challenges.

Mol Ther Nucleic Acids. 2025 Jul 17;36(3):102636. doi: 10.1016/j.omtn.2025.102636. eCollection 2025 Sep 9.

CNN-Based Automatic Tablet Classification Using a Vibration-Controlled Bowl Feeder with Spiral Torque Optimization.

Sensors (Basel). 2025 Jul 8;25(14):4248. doi: 10.3390/s25144248.

Off-target sequence variations driven by the intrinsic properties of the Cas-sgRNA-DNA complex in genome editing.

PLoS One. 2025 Jul 18;20(7):e0328905. doi: 10.1371/journal.pone.0328905. eCollection 2025.

Artificial Intelligence in CRISPR-Cas Systems: A Review of Tool Applications.

Methods Mol Biol. 2025;2952:243-257. doi: 10.1007/978-1-0716-4690-8_14.

From Code to Life: The AI-Driven Revolution in Genome Editing.

Adv Sci (Weinh). 2025 Aug;12(30):e17029. doi: 10.1002/advs.202417029. Epub 2025 Jun 19.

Unlocking the potential of flavonoid biosynthesis through integrated metabolic engineering.

Front Plant Sci. 2025 May 29;16:1597007. doi: 10.3389/fpls.2025.1597007. eCollection 2025.

A versatile CRISPR/Cas9 system off-target prediction tool using language model.

Commun Biol. 2025 Jun 6;8(1):882. doi: 10.1038/s42003-025-08275-6.

Deep Learning Based Models for CRISPR/Cas Off-Target Prediction.

Small Methods. 2025 Jul;9(7):e2500122. doi: 10.1002/smtd.202500122. Epub 2025 Jun 4.

Learning to utilize internal protein 3D nanoenvironment descriptors in predicting CRISPR-Cas9 off-target activity.

NAR Genom Bioinform. 2025 May 21;7(2):lqaf054. doi: 10.1093/nargab/lqaf054. eCollection 2025 Jun.
