Predicting CRISPR-Cas9 off-target effects in human primary cells using bidirectional LSTM with BERT embedding.

Authors

Sari Orhan, Liu Ziying, Pan Youlian, Shao Xiaojian

Affiliations

Department of Mining and Materials Engineering, McGill University, Montreal, QC, H3A 2B1, Canada.

Digital Technologies Research Center, National Research Council Canada, Ottawa, ON, K1A 0R6, Canada.

Publication information

Bioinform Adv. 2024 Dec 30;5(1):vbae184. doi: 10.1093/bioadv/vbae184. eCollection 2025.

DOI:10.1093/bioadv/vbae184
PMID:39758829
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11696696/
Abstract

MOTIVATION

The Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-Cas9 system is a ground-breaking genome editing tool that has revolutionized cell and gene therapies. Essential to its success is the design of an optimal single-guide RNA (sgRNA) with high on-target cleavage efficiency and low off-target effects. This is challenging because many conditions need to be considered, and empirically testing every design is time-consuming and costly. In silico prediction using machine learning models provides a high-performance alternative.

RESULTS

We present CrisprBERT, a deep learning model that predicts the off-target effects of sgRNAs using only the sgRNAs and their paired DNA sequences. It combines a Bidirectional Encoder Representations from Transformers (BERT) architecture, which provides a high-dimensional embedding for paired sgRNA and DNA sequences, with Bidirectional Long Short-Term Memory networks for learning. We propose doublet stack encoding to capture the local energy configuration of Cas9 binding and apply the BERT model to learn the contextual embedding of the doublet pairs. The new model outperformed state-of-the-art deep learning models in single-split and leave-one-sgRNA-out cross-validation as well as in independent testing.
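The abstract does not spell out the doublet stack encoding, so the following is only a rough sketch of one plausible reading: each aligned sgRNA/DNA position forms a base pair, and every two adjacent base pairs form one "doublet" token. The names `BASE_PAIRS`, `DOUBLET_VOCAB`, and `doublet_stack_encode`, the A/C/G/T alphabet, and the token numbering are all assumptions made for illustration and are not taken from the CrisprBERT repository.

```python
from itertools import product

BASES = "ACGT"  # assumption: the sgRNA is written with DNA letters (T instead of U)

# Hypothetical vocabulary: 16 sgRNA/DNA base pairs, 16 * 16 = 256 adjacent-pair doublets.
BASE_PAIRS = [g + d for g, d in product(BASES, repeat=2)]
DOUBLET_VOCAB = {
    first + second: token_id
    for token_id, (first, second) in enumerate(product(BASE_PAIRS, repeat=2))
}


def doublet_stack_encode(sgrna: str, dna: str) -> list[int]:
    """Turn an aligned sgRNA/DNA pair into a sequence of doublet token ids.

    Position i of the output describes base pairs i and i + 1, so a pair of
    length n yields n - 1 overlapping tokens.
    """
    if len(sgrna) != len(dna):
        raise ValueError("sgRNA and DNA sequences must have the same length")
    pairs = [g + d for g, d in zip(sgrna.upper(), dna.upper())]
    tokens = []
    for left, right in zip(pairs, pairs[1:]):
        doublet = left + right
        if doublet not in DOUBLET_VOCAB:
            raise ValueError(f"unsupported characters in doublet {doublet!r}")
        tokens.append(DOUBLET_VOCAB[doublet])
    return tokens
```

Under these assumptions, token ids like these would be what an embedding layer consumes; a real implementation would also need special tokens and handling for ambiguous bases or indels, which this sketch leaves out.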

AVAILABILITY AND IMPLEMENTATION

CrisprBERT is available on GitHub: https://github.com/OSsari/CrisprBERT.
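The leave-one-sgRNA-out cross-validation mentioned in the results can be illustrated generically: all candidate sites belonging to one sgRNA are held out as the test fold, and the model is trained on the sites of every other sgRNA. The sketch below is a minimal, library-free version of that splitting scheme; the function name `leave_one_sgrna_out` and the input layout (one sgRNA label per sample) are assumptions, not the authors' code.

```python
from collections import defaultdict
from typing import Iterator


def leave_one_sgrna_out(
    sgrnas: list[str],
) -> Iterator[tuple[str, list[int], list[int]]]:
    """Yield (held_out_sgrna, train_indices, test_indices) for each distinct sgRNA.

    `sgrnas[i]` is the guide that sample i belongs to. Every sample of the
    held-out guide goes to the test fold, so no guide appears on both sides.
    """
    groups: dict[str, list[int]] = defaultdict(list)
    for index, guide in enumerate(sgrnas):
        groups[guide].append(index)

    for held_out, test_indices in groups.items():
        train_indices = [
            index
            for guide, indices in groups.items()
            if guide != held_out
            for index in indices
        ]
        yield held_out, train_indices, test_indices


if __name__ == "__main__":
    # Toy labels: five candidate sites drawn from three hypothetical guides.
    sample_guides = ["g1", "g1", "g2", "g3", "g2"]
    for guide, train, test in leave_one_sgrna_out(sample_guides):
        print(guide, "train:", sorted(train), "test:", test)
```

Splitting by guide rather than by individual site keeps near-identical sites of the same sgRNA from leaking between training and test data, which is why this scheme is a stricter check of generalization to unseen guides than a random split.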

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8239/11696696/a92544013bfb/vbae184f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8239/11696696/aefcaebe8357/vbae184f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8239/11696696/40b4a6f2d0b0/vbae184f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8239/11696696/35f03644675c/vbae184f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8239/11696696/88cee051741b/vbae184f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8239/11696696/184b663576f0/vbae184f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8239/11696696/3aecac8272cf/vbae184f7.jpg