Beyond Tripeptides: Two-Step Active Machine Learning for Very Large Data Sets

Authors

van Teijlingen Alexander, Tuttle Tell

Affiliation

Department of Chemistry, University of Strathclyde, 295 Cathedral Street, Glasgow G1 1XL, U.K.

Publication details

J Chem Theory Comput. 2021 May 11;17(5):3221-3232. doi: 10.1021/acs.jctc.1c00159. Epub 2021 Apr 27.

DOI:10.1021/acs.jctc.1c00159

PMID:33904712

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8278388/

Abstract

Self-assembling peptide nanostructures have been shown to be of great importance in nature and have presented many promising applications, for example, in medicine as drug-delivery vehicles, biosensors, and antivirals. Because such peptides are very promising candidates for the growing field of bottom-up manufacture of functional nanomaterials, previous work (Frederix et al., 2011 and 2015) has screened all possible amino acid combinations for di- and tripeptides in search of such materials. However, the enormous complexity and variety of linear combinations of the 20 amino acids make exhaustive simulation of all combinations of tetrapeptides and above infeasible. Therefore, we have developed an active machine-learning method (also known as "iterative learning" and "evolutionary search method") which leverages a lower-resolution data set encompassing the whole search space and a just-in-time high-resolution data set which further analyzes those target peptides selected by the lower-resolution model. This model uses newly generated data upon each iteration to improve both lower- and higher-resolution models in the search for ideal candidates. Curation of the lower-resolution data set is explored as a method to control the selected candidates, based on criteria such as log P. A major aim of this method is to produce the best results in the least computationally demanding way. This model has been developed to be broadly applicable to other search spaces with minor changes to the algorithm, allowing its use in other areas of research.
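
The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `expensive_score` is a hypothetical toy stand-in for the costly high-resolution evaluation, `fit_cheap_model` is an assumed additive per-residue surrogate, and the batch sizes are arbitrary. For brevity the sketch uses a single surrogate, whereas the abstract describes both a lower- and a higher-resolution model being updated on each iteration. The full space grows as 20**n, so tetrapeptides already give 20**4 = 160,000 sequences.

```python
import itertools
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def search_space(length):
    """Enumerate every linear peptide of the given length (20**length sequences)."""
    return ["".join(p) for p in itertools.product(AMINO_ACIDS, repeat=length)]


def expensive_score(peptide):
    """Hypothetical stand-in for the costly high-resolution evaluation.

    Toy rule (an assumption, not from the paper): aromatic residues raise
    the score, charged residues lower it.
    """
    weights = {"F": 1.0, "W": 1.0, "Y": 0.8, "D": -0.5, "E": -0.5, "K": -0.5, "R": -0.5}
    return sum(weights.get(aa, 0.0) for aa in peptide)


def fit_cheap_model(labelled):
    """Assumed low-cost surrogate: mean observed score share per residue type."""
    totals = {aa: 0.0 for aa in AMINO_ACIDS}
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for peptide, score in labelled.items():
        for aa in peptide:
            totals[aa] += score / len(peptide)  # split the score evenly over residues
            counts[aa] += 1
    return {aa: totals[aa] / counts[aa] if counts[aa] else 0.0 for aa in AMINO_ACIDS}


def predict(model, peptide):
    """Cheap prediction: sum of the learned per-residue contributions."""
    return sum(model[aa] for aa in peptide)


def active_search(length=3, n_iterations=5, batch_size=20, seed=0):
    """Iteratively rank the whole space cheaply and evaluate only the top picks."""
    rng = random.Random(seed)
    space = search_space(length)
    # Seed round: a random batch is evaluated with the expensive function.
    labelled = {p: expensive_score(p) for p in rng.sample(space, batch_size)}
    for _ in range(n_iterations):
        model = fit_cheap_model(labelled)  # refit on all data gathered so far
        candidates = [p for p in space if p not in labelled]
        candidates.sort(key=lambda p: predict(model, p), reverse=True)
        for p in candidates[:batch_size]:  # only these reach the expensive step
            labelled[p] = expensive_score(p)
    return labelled


if __name__ == "__main__":
    results = active_search()
    best = max(results, key=results.get)
    print(len(results), "of", 20 ** 3, "peptides evaluated; best so far:", best)
```

With these default settings only 120 of the 8,000 tripeptides are passed to the expensive function, which is the point of the two-step design: the cheap model covers the whole space, and the costly evaluation is reserved for the candidates it selects.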


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1f74/8278388/6a3ea852b606/ct1c00159_0002.jpg

Similar articles

Self-assembly of amphiphilic tripeptides with sequence-dependent nanostructure.

Biomater Sci. 2017 Jul 25;5(8):1526-1530. doi: 10.1039/c7bm00304h.

Exploring the sequence space for (tri-)peptide self-assembly to design and discover new hydrogels.

Nat Chem. 2015 Jan;7(1):30-7. doi: 10.1038/nchem.2122. Epub 2014 Dec 8.

Aromatic Motifs Dictate Nanohelix Handedness of Tripeptides.

ACS Nano. 2018 Dec 26;12(12):12305-12314. doi: 10.1021/acsnano.8b06173. Epub 2018 Nov 26.

Iterative processes: a review of semi-supervised machine learning in rehabilitation science.

Disabil Rehabil Assist Technol. 2020 Jul;15(5):515-520. doi: 10.1080/17483107.2019.1604831. Epub 2019 Jul 8.

Sequence-Dependent Nanofiber Structures of Phenylalanine and Isoleucine Tripeptides.

Int J Mol Sci. 2020 Nov 10;21(22):8431. doi: 10.3390/ijms21228431.

PEP search in MyCompoundID: detection and identification of dipeptides and tripeptides using dimethyl labeling and hydrophilic interaction liquid chromatography tandem mass spectrometry.

Anal Chem. 2014 Apr 1;86(7):3568-74. doi: 10.1021/ac500109y. Epub 2014 Mar 17.

Translational Metabolomics of Head Injury: Exploring Dysfunctional Cerebral Metabolism with Ex Vivo NMR Spectroscopy-Based Metabolite Quantification

Structural Information-Based Method for the Efficient and Reliable Prediction of Oligopeptide Conformations.

J Phys Chem B. 2017 Mar 30;121(12):2525-2533. doi: 10.1021/acs.jpcb.6b12415. Epub 2017 Mar 16.

A Drug-Target Network-Based Supervised Machine Learning Repurposing Method Allowing the Use of Multiple Heterogeneous Information Sources.

Methods Mol Biol. 2019;1903:281-289. doi: 10.1007/978-1-4939-8955-3_17.

Cited by

Central position of histidine in the sequence of designed alternating polarity peptides enhances pH-responsive assembly with DNA.

BMC Biotechnol. 2025 Jul 1;25(1):54. doi: 10.1186/s12896-025-00976-4.

Discovery of unconventional and nonintuitive self-assembling peptide materials using experiment-driven machine learning.

Sci Adv. 2025 Jun 13;11(24):eadt9466. doi: 10.1126/sciadv.adt9466. Epub 2025 Jun 11.

Learning the rules of peptide self-assembly through data mining with large language models.

Sci Adv. 2025 Mar 28;11(13):eadv1971. doi: 10.1126/sciadv.adv1971. Epub 2025 Mar 26.

Aggregation Rules of Short Peptides.

JACS Au. 2024 Sep 3;4(9):3567-3580. doi: 10.1021/jacsau.4c00501. eCollection 2024 Sep 23.

Tips and Tricks in the Modeling of Supramolecular Peptide Assemblies.

ACS Omega. 2024 Jul 8;9(29):31254-31273. doi: 10.1021/acsomega.4c02628. eCollection 2024 Jul 23.

Assessment of the MARTINI 3 Performance for Short Peptide Self-Assembly.

J Chem Theory Comput. 2024 Jan 9;20(1):224-238. doi: 10.1021/acs.jctc.3c01015. Epub 2023 Dec 19.

Multiscale Simulations to Discover Self-Assembled Oligopeptides: A Benchmarking Study.

J Chem Theory Comput. 2024 Jan 9;20(1):375-384. doi: 10.1021/acs.jctc.3c00699. Epub 2023 Nov 28.

Deep Learning Empowers the Discovery of Self-Assembling Peptides with Over 10 Trillion Sequences.

Adv Sci (Weinh). 2023 Nov;10(31):e2301544. doi: 10.1002/advs.202301544. Epub 2023 Sep 25.

Short Peptide Self-Assembly in the Martini Coarse-Grain Force Field Family.

Acc Chem Res. 2023 Mar 21;56(6):644-654. doi: 10.1021/acs.accounts.2c00810. Epub 2023 Mar 3.

Constant pH Coarse-Grained Molecular Dynamics with Stochastic Charge Neutralization.

J Phys Chem Lett. 2022 May 12;13(18):4046-4051. doi: 10.1021/acs.jpclett.2c00544. Epub 2022 Apr 29.

References

Discovery of Self-Assembling π-Conjugated Peptides by Active Learning-Directed Coarse-Grained Molecular Simulation.

J Phys Chem B. 2020 May 14;124(19):3873-3891. doi: 10.1021/acs.jpcb.0c00708. Epub 2020 Mar 30.

Injectable self-assembled bola-dipeptide hydrogels for sustained photodynamic prodrug delivery and enhanced tumor therapy.

J Control Release. 2020 Mar 10;319:344-351. doi: 10.1016/j.jconrel.2020.01.002. Epub 2020 Jan 7.

Toward insights on determining factors for high activity in antimicrobial peptides via machine learning.

PeerJ. 2019 Dec 20;7:e8265. doi: 10.7717/peerj.8265. eCollection 2019.

DeepHLApan: A Deep Learning Approach for Neoantigen Prediction Considering Both HLA-Peptide Binding and Immunogenicity.

Front Immunol. 2019 Nov 1;10:2559. doi: 10.3389/fimmu.2019.02559. eCollection 2019.

Effect of self-assembly on antimicrobial activity of double-chain short cationic lipopeptides.

Bioorg Med Chem. 2019 Dec 1;27(23):115129. doi: 10.1016/j.bmc.2019.115129. Epub 2019 Oct 17.

Strategy to Identify Improved N-Terminal Modifications for Supramolecular Phenylalanine-Derived Hydrogelators.

Langmuir. 2019 Nov 19;35(46):14939-14948. doi: 10.1021/acs.langmuir.9b02971. Epub 2019 Nov 8.

Induction of p73, Δ133p53, Δ160p53, pAKT lead to neuroprotection via DNA repair by 5-LOX inhibition.

Mol Biol Rep. 2020 Jan;47(1):269-274. doi: 10.1007/s11033-019-05127-5. Epub 2019 Oct 28.

Design of self-assembly dipeptide hydrogels and machine learning via their chemical features.

Proc Natl Acad Sci U S A. 2019 Jun 4;116(23):11259-11264. doi: 10.1073/pnas.1903376116. Epub 2019 May 20.

Doxorubicin-reinforced supramolecular hydrogels of RGD-derived peptide conjugates for pH-responsive drug delivery.

Org Biomol Chem. 2019 Apr 10;17(15):3853-3860. doi: 10.1039/c9ob00046a.

Supramolecular Tripeptide Hydrogel Assembly with 5-Fluorouracil.

Gels. 2019 Jan 26;5(1):5. doi: 10.3390/gels5010005.
