Molecular Dynamics (MD)-Derived Features for Canonical and Noncanonical Amino Acids.

Authors

Hui Tiffani, Secor Maxim, Ho Minh Ngoc, Bayaraa Nomindari, Lin Yu-Shan

Affiliation

Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States.

Publication information

J Chem Inf Model. 2025 Feb 24;65(4):1837-1849. doi: 10.1021/acs.jcim.4c02102. Epub 2025 Feb 2.

DOI:10.1021/acs.jcim.4c02102
PMID:39895111
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11863381/
Abstract

Machine learning (ML) models have become increasingly popular for predicting and designing structures and properties of peptides and proteins. These ML models typically use peptides and proteins containing only canonical amino acids as the training data. Consequently, these models struggle to make accurate predictions for peptides and proteins containing new amino acids that are absent in the training data set (i.e., noncanonical amino acids). One approach to improve the accuracy of the models is to collect more training data with the desired amino acids. However, this strategy is suboptimal as new data may not be easily attainable, and additional time is required to retrain the ML models. Alternatively, the extendibility of the ML models can be improved if the amino acid features used are representative and generalizable to the unseen amino acids. Herein, we develop amino acid features using molecular dynamics (MD) simulation results. Specifically, for a given amino acid, we perform MD simulation of its dipeptide to create features based on its backbone (ϕ, ψ) distributions and its electrostatic potentials. We demonstrate that these new features enable our ML models to more accurately predict the structural ensembles of cyclic peptides containing amino acids not present in the original training data set. For example, we build ML models to predict cyclic pentapeptide structures, with the training data set containing a library of 15 amino acids and the test data set containing the same 15-amino-acid library or an extended 50-amino-acid library. When using popular features such as Morgan fingerprints and MACCS keys to represent amino acids, the ML models achieve R² = 0.963 for structural predictions of test cyclic pentapeptides containing the same 15-amino-acid library. However, these models' performances decrease significantly to R² = 0.430 and R² = 0.508, respectively, when tasked to predict the structures of cyclic pentapeptides containing a library of 50 amino acids. On the other hand, the model using our backbone (ϕ, ψ) features outperforms those using Morgan fingerprints and MACCS keys, with R² = 0.700. Overall, instead of having to collect more training data, our new features enable predictions of peptide sequences containing amino acids not originally present in the training data set at the mere cost of performing new dipeptide simulations for the new amino acids.
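
To make the idea of a feature "based on backbone (ϕ, ψ) distributions" concrete, the following is a minimal sketch of one way sampled dihedral angles could be turned into a fixed-length vector. The abstract does not say how the distributions are encoded, so the periodic 2D histogram, the bin count, and the name `phi_psi_histogram` are assumptions made for illustration. The input samples are synthetic random draws, not MD output.

```python
import random


def phi_psi_histogram(samples, n_bins=12):
    """Bin (phi, psi) pairs, given in degrees, on a periodic grid.

    Returns a flat list of n_bins * n_bins probabilities that sums to 1.
    Hypothetical encoding: the paper's actual featurization may differ.
    """
    if not samples:
        raise ValueError("at least one (phi, psi) sample is required")
    width = 360.0 / n_bins
    counts = [0] * (n_bins * n_bins)
    for phi, psi in samples:
        # Shift into [0, 360) so that -180 and +180 land in the same bin;
        # the final modulo guards against a float rounding up to 360.0.
        i = int(((phi + 180.0) % 360.0) // width) % n_bins
        j = int(((psi + 180.0) % 360.0) // width) % n_bins
        counts[i * n_bins + j] += 1
    total = len(samples)
    return [c / total for c in counts]


# Synthetic stand-in for a dipeptide trajectory: angles scattered
# around (-60, -45) degrees. These numbers are made up for the example.
rng = random.Random(0)
fake_samples = [(rng.gauss(-60.0, 15.0), rng.gauss(-45.0, 15.0))
                for _ in range(5000)]

features = phi_psi_histogram(fake_samples, n_bins=12)
print(len(features))  # one entry per grid cell
```

Because the vector depends only on the simulated dipeptide, a feature of this kind can in principle be computed for an amino acid the model has never seen, which is the property the abstract relies on.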


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c4/11863381/317a5766fc4a/ci4c02102_0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c4/11863381/29040ab6c1e3/ci4c02102_0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c4/11863381/f1fada5ba77b/ci4c02102_0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c4/11863381/801d378c9846/ci4c02102_0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/71c4/11863381/c165c69e9762/ci4c02102_0005.jpg


