Quantum mechanical electronic and geometric parameters for DNA k-mers as features for machine learning.

Affiliation

MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, OX3 9DS, UK.

Publication information

Sci Data. 2024 Aug 22;11(1):911. doi: 10.1038/s41597-024-03772-5.

DOI:10.1038/s41597-024-03772-5
PMID:39174574
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11341866/
Abstract

We are witnessing a steep increase in model development initiatives in genomics that employ high-end machine learning methodologies. Of particular interest are models that predict certain genomic characteristics based solely on DNA sequence. These models, however, treat the DNA as a mere collection of four, A, T, G and C, letters, dismissing the past advancements in science that can enable the use of more intricate information from nucleic acid sequences. Here, we provide a comprehensive database of quantum mechanical (QM) and geometric features for all the permutations of 7-meric DNA in their representative B, A and Z conformations. The database is generated by employing the applicable high-cost and time-consuming QM methodologies. This can thus make it seamless to associate a wealth of novel molecular features to any DNA sequence, by scanning it with a matching k-meric window and pulling the pre-computed values from our database for further use in modelling. We demonstrate the usefulness of our deposited features through their exclusive use in developing a model for A->C mutation rates.
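
The window-scan-and-lookup workflow the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' code or the database's actual schema: the function names `all_kmers` and `featurize`, the dictionary layout, and the placeholder feature (GC fraction, standing in for the real QM and geometric values) are all assumptions made for the example.

```python
from itertools import product

K = 7                # window width matching the 7-mers in the database
ALPHABET = "ACGT"


def all_kmers(k=K):
    """Enumerate every k-mer over A, C, G, T (4**k sequences in total)."""
    return ["".join(p) for p in product(ALPHABET, repeat=k)]


def featurize(sequence, table, k=K):
    """Slide a k-wide window over `sequence` and look up each k-mer.

    Returns one entry per window. Windows holding a character that is
    not in the table (for example N) yield None.
    """
    seq = sequence.upper()
    return [table.get(seq[i:i + k]) for i in range(len(seq) - k + 1)]


# Hypothetical stand-in for the precomputed table: each 7-mer maps to a
# one-element feature vector holding its GC fraction. A real table would
# hold the deposited QM and geometric values instead.
toy_table = {
    kmer: [(kmer.count("G") + kmer.count("C")) / K] for kmer in all_kmers()
}

features = featurize("ACGTACGTAC", toy_table)
print(len(toy_table), len(features))
```

A sequence of length n yields n - 6 windows, so each position's feature vector can be fed to a downstream model in place of, or alongside, a one-hot encoding of the bases.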

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8232/11341866/5bf32c1926bf/41597_2024_3772_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8232/11341866/b8ba76b3da31/41597_2024_3772_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8232/11341866/e397b671ca01/41597_2024_3772_Fig2_HTML.jpg
