
Interpretable CRISPR/Cas9 off-target activities with mismatches and indels prediction using BERT.

Affiliations

College of Engineering, Shantou University, Shantou, 515063, China.

Publication information

Comput Biol Med. 2024 Feb;169:107932. doi: 10.1016/j.compbiomed.2024.107932. Epub 2024 Jan 1.

DOI:10.1016/j.compbiomed.2024.107932
PMID:38199209
Abstract

Off-target effects of CRISPR/Cas9 can lead to suboptimal genome editing outcomes. Numerous deep learning-based approaches have achieved excellent performance in off-target prediction; however, few can predict off-target activities involving both mismatches and indels between a single guide RNA (sgRNA) and its target DNA sequence. In addition, data imbalance is a common pitfall in off-target prediction. Moreover, owing to the complexity of genomic contexts, building an interpretable model remains challenging. To address these issues, we first developed a BERT-based model, CRISPR-BERT, to improve the prediction of off-target activities with both mismatches and indels. Second, we proposed an adaptive batch-wise class balancing strategy to combat the noise in imbalanced off-target data. Finally, we applied a visualization approach to investigate generalizable, nucleotide position-dependent patterns of the sgRNA-DNA pair that relate to off-target activity. In a comprehensive comparison with existing methods on five mismatches-only datasets and two mismatches-and-indels datasets, CRISPR-BERT achieved the best performance in terms of AUROC and PRAUC. In addition, the visualization analysis demonstrated how the implicit knowledge learned by CRISPR-BERT facilitates off-target prediction, which shows potential for model interpretability. Collectively, CRISPR-BERT provides an accurate and interpretable framework for off-target prediction and further contributes to sgRNA optimization in practical use for improved target specificity in CRISPR/Cas9 genome editing. The source code is available at https://github.com/BrokenStringx/CRISPR-BERT.
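
The abstract does not spell out how the adaptive batch-wise class balancing works, so the following is only a minimal sketch of the general idea of balancing classes per batch, not the authors' method. It assumes binary labels (1 = off-target active, 0 = inactive) and uses a fixed, non-adaptive positive fraction; the function name `balanced_batches` and its parameters are hypothetical and chosen for illustration.

```python
import random
from typing import Iterator, List, Sequence


def balanced_batches(
    labels: Sequence[int],
    batch_size: int,
    pos_fraction: float = 0.5,
    seed: int = 0,
) -> Iterator[List[int]]:
    """Yield index batches in which a fixed share of samples is positive.

    Illustrative assumption: negatives are visited once per epoch without
    replacement, while positives are drawn with replacement, so the rare
    class is oversampled within every batch.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    n_pos = max(1, round(batch_size * pos_fraction))
    n_neg = batch_size - n_pos
    if n_neg < 1:
        raise ValueError("batch must leave room for at least one negative")

    rng.shuffle(neg)
    for start in range(0, len(neg), n_neg):
        # Each batch: the next slice of negatives plus resampled positives.
        batch = neg[start:start + n_neg] + rng.choices(pos, k=n_pos)
        rng.shuffle(batch)
        yield batch


# Toy imbalanced dataset: 5 positives among 100 samples.
labels = [1] * 5 + [0] * 95
batches = list(balanced_batches(labels, batch_size=10))
```

With these toy labels, every batch holds five negatives and five resampled positives, so a classifier trained on such batches sees both classes equally often even though positives make up only 5% of the data. An adaptive variant would adjust `pos_fraction` during training rather than keeping it fixed.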


Similar articles

1
Interpretable CRISPR/Cas9 off-target activities with mismatches and indels prediction using BERT.
Comput Biol Med. 2024 Feb;169:107932. doi: 10.1016/j.compbiomed.2024.107932. Epub 2024 Jan 1.
2
Crispr-SGRU: Prediction of CRISPR/Cas9 Off-Target Activities with Mismatches and Indels Using Stacked BiGRU.
Int J Mol Sci. 2024 Oct 11;25(20):10945. doi: 10.3390/ijms252010945.
3
CrnnCrispr: An Interpretable Deep Learning Method for CRISPR/Cas9 sgRNA On-Target Activity Prediction.
Int J Mol Sci. 2024 Apr 17;25(8):4429. doi: 10.3390/ijms25084429.
4
Benchmarking deep learning methods for predicting CRISPR/Cas9 sgRNA on- and off-target activities.
Brief Bioinform. 2023 Sep 22;24(6). doi: 10.1093/bib/bbad333.
5
CRISPR-M: Predicting sgRNA off-target effect using a multi-view deep learning network.
PLoS Comput Biol. 2024 Mar 14;20(3):e1011972. doi: 10.1371/journal.pcbi.1011972. eCollection 2024 Mar.
6
Off-target predictions in CRISPR-Cas9 gene editing using deep learning.
Bioinformatics. 2018 Sep 1;34(17):i656-i663. doi: 10.1093/bioinformatics/bty554.
7
[Prediction of CRISPR/Cas9 off-target activity using multi-scale convolutional neural network].
Sheng Wu Gong Cheng Xue Bao. 2024 Mar 25;40(3):858-876. doi: 10.13345/j.cjb.230382.
8
Synergizing CRISPR/Cas9 off-target predictions for ensemble insights and practical applications.
Bioinformatics. 2019 Apr 1;35(7):1108-1115. doi: 10.1093/bioinformatics/bty748.
9
Genome-wide CRISPR off-target prediction and optimization using RNA-DNA interaction fingerprints.
Nat Commun. 2023 Nov 18;14(1):7521. doi: 10.1038/s41467-023-42695-4.
10
Amplification-free long-read sequencing reveals unforeseen CRISPR-Cas9 off-target activity.
Genome Biol. 2020 Dec 1;21(1):290. doi: 10.1186/s13059-020-02206-w.

Cited by

1
Language Modelling Techniques for Analysing the Impact of Human Genetic Variation.
Bioinform Biol Insights. 2025 Sep 2;19:11779322251358314. doi: 10.1177/11779322251358314. eCollection 2025.
2
Off-target sequence variations driven by the intrinsic properties of the Cas-sgRNA-DNA complex in genome editing.
PLoS One. 2025 Jul 18;20(7):e0328905. doi: 10.1371/journal.pone.0328905. eCollection 2025.
3
From Code to Life: The AI-Driven Revolution in Genome Editing.
Adv Sci (Weinh). 2025 Aug;12(30):e17029. doi: 10.1002/advs.202417029. Epub 2025 Jun 19.
4
A versatile CRISPR/Cas9 system off-target prediction tool using language model.
Commun Biol. 2025 Jun 6;8(1):882. doi: 10.1038/s42003-025-08275-6.
5
Deep Learning Based Models for CRISPR/Cas Off-Target Prediction.
Small Methods. 2025 Jul;9(7):e2500122. doi: 10.1002/smtd.202500122. Epub 2025 Jun 4.
6
CRISPR-MFH: A Lightweight Hybrid Deep Learning Framework with Multi-Feature Encoding for Improved CRISPR-Cas9 Off-Target Prediction.
Genes (Basel). 2025 Mar 28;16(4):387. doi: 10.3390/genes16040387.
7
Transitioning from wet lab to artificial intelligence: a systematic review of AI predictors in CRISPR.
J Transl Med. 2025 Feb 4;23(1):153. doi: 10.1186/s12967-024-06013-w.
8
DeepIndel: An Interpretable Deep Learning Approach for Predicting CRISPR/Cas9-Mediated Editing Outcomes.
Int J Mol Sci. 2024 Oct 11;25(20):10928. doi: 10.3390/ijms252010928.
9
The Evolution of Nucleic Acid-Based Diagnosis Methods from the (pre-)CRISPR to CRISPR era and the Associated Machine/Deep Learning Approaches in Relevant RNA Design.
Methods Mol Biol. 2025;2847:241-300. doi: 10.1007/978-1-0716-4079-1_17.
10
Generating, modeling and evaluating a large-scale set of CRISPR/Cas9 off-target sites with bulges.
Nucleic Acids Res. 2024 Jul 8;52(12):6777-6790. doi: 10.1093/nar/gkae428.