Be-dataHIVE: a base editing database.

Affiliation

Department of Computer Science, University of Oxford, Parks Road, Oxford, OX1 3QD, UK.

Publication information

BMC Bioinformatics. 2024 Oct 15;25(1):330. doi: 10.1186/s12859-024-05898-0.

Abstract

Base editing is an enhanced gene editing approach that enables the precise transformation of single nucleotides and has the potential to cure rare diseases. The design process of base editors is labour-intensive and outcomes are not easily predictable. For any clinical use, base editing has to be accurate and efficient. Thus, any bystander mutations have to be minimized. In recent years, computational models to predict base editing outcomes have been developed. However, the overall robustness and performance of those models are limited. One way to improve the performance is to train models on a diverse, feature-rich, and large dataset, which does not exist for the base editing field. Hence, we develop BE-dataHIVE, a MySQL database that covers over 460,000 gRNA target combinations. The current version of BE-dataHIVE consists of data from five studies and is enriched with melting temperatures and energy terms. Furthermore, multiple different data structures for machine learning were computed and are directly available. The database can be accessed via our website https://be-datahive.com/ or API and is therefore suitable for practitioners and machine learning researchers.
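
To make the idea of sequence-derived features concrete, the following is a minimal sketch of two features a reader might compute from a gRNA sequence: a one-hot matrix (a common machine learning input format) and a rough melting temperature. It is illustrative only. The abstract does not state how BE-dataHIVE computes its melting temperatures or which data structures it provides, so the Wallace rule used here, the function names, and the example sequence are all assumptions, not the authors' method.

```python
from typing import List

BASES = "ACGT"  # column order of the one-hot matrix (an assumption)


def one_hot_encode(seq: str) -> List[List[int]]:
    """Return a len(seq) x 4 one-hot matrix with columns ordered A, C, G, T."""
    seq = seq.upper()
    unknown = set(seq) - set(BASES)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [[1 if base == b else 0 for b in BASES] for base in seq]


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in deg C via the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C). This is a rule of thumb for short
    oligonucleotides, not necessarily what BE-dataHIVE uses.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    guide = "GACGTTACCGGATCAAGCTT"  # hypothetical 20-nt protospacer
    matrix = one_hot_encode(guide)
    print(len(matrix), "x", len(matrix[0]), "one-hot matrix")
    print("Wallace Tm estimate:", wallace_tm(guide), "deg C")
```

Nearest-neighbour thermodynamic models generally give better melting temperature estimates than the Wallace rule for sequences of this length; the simpler rule is used here only to keep the sketch free of dependencies.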
