FEP Augmentation as a Means to Solve Data Paucity Problems for Machine Learning in Chemical Biology.

Affiliations

Avicenna Biosciences Inc., 101 W. Chapel Hill Street, Suite 210, Durham, North Carolina 27001, United States.

Schrödinger, Inc., 120 West 45th Street, New York, New York 10036, United States.

Publication information

J Chem Inf Model. 2024 May 13;64(9):3812-3825. doi: 10.1021/acs.jcim.4c00071. Epub 2024 Apr 23.

DOI:10.1021/acs.jcim.4c00071

PMID:38651738

Original article link: https://pmc.ncbi.nlm.nih.gov/articles/PMC11094716/

Abstract

In medicinal chemistry, the primary objective is to swiftly optimize multiple chemical properties of a set of compounds to yield a candidate ready for clinical trials. In recent years, two computational techniques, machine learning (ML) and physics-based methods, have evolved substantially and are now frequently incorporated into the medicinal chemist's toolbox to enhance the efficiency of both hit optimization and candidate design. Both methods have their own limitations, and they are often used independently of each other. ML's capability to screen extensive compound libraries quickly is tempered by its reliance on quality data, which can be scarce, especially during early-stage optimization. Conversely, physics-based approaches such as free energy perturbation (FEP) are frequently constrained by comparatively low throughput and high cost; however, they are capable of making highly accurate binding affinity predictions. In this study, we harnessed the strength of FEP to overcome data paucity in ML by generating virtual activity data sets that then inform the training of algorithms. Here, we show that ML algorithms trained with an FEP-augmented data set could achieve predictive accuracy comparable to that of algorithms trained on experimental data from biological assays. Throughout the paper, we emphasize key mechanistic considerations that must be taken into account when augmenting data sets and lay the groundwork for successful implementation. Ultimately, the study advocates for the synergy of physics-based methods and ML to expedite the lead optimization process. We believe that the physics-based augmentation of ML will significantly benefit drug discovery as these techniques continue to evolve.
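The augmentation idea in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the data are synthetic, the "FEP" labels are simply noisier draws from the same hidden relationship rather than real free energy calculations, and closed-form ridge regression stands in for whatever ML algorithms the paper used. The abstract does not specify descriptors, models, or data set sizes, so the helper names (`fit_ridge`, `augment`, `rmse`) and all numbers are hypothetical.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; returns the weight vector."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


def rmse(y_true, y_pred):
    """Root-mean-square error between two 1-D arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def augment(X_exp, y_exp, X_virtual, y_virtual):
    """Stack experimental rows and virtually labelled rows into one training set."""
    return np.vstack([X_exp, X_virtual]), np.concatenate([y_exp, y_virtual])


rng = np.random.default_rng(0)
n_features = 16

# Hidden synthetic structure-activity relationship (an assumption, not real chemistry).
true_w = rng.normal(size=n_features)

# Small "experimental" set: few compounds, low label noise.
X_exp = rng.normal(size=(10, n_features))
y_exp = X_exp @ true_w + rng.normal(scale=0.3, size=10)

# Larger "virtual" set: labels play the role of FEP predictions.
# Here they are just noisier synthetic values, a placeholder for real calculations.
X_virtual = rng.normal(size=(200, n_features))
y_fep = X_virtual @ true_w + rng.normal(scale=1.0, size=200)

# Held-out compounds with noise-free synthetic activities for evaluation.
X_test = rng.normal(size=(100, n_features))
y_test = X_test @ true_w

# Model A: trained on experimental data only.
w_exp = fit_ridge(X_exp, y_exp)

# Model B: trained on the augmented (experimental + virtual) data set.
X_aug, y_aug = augment(X_exp, y_exp, X_virtual, y_fep)
w_aug = fit_ridge(X_aug, y_aug)

rmse_exp_only = rmse(y_test, X_test @ w_exp)
rmse_augmented = rmse(y_test, X_test @ w_aug)
print(f"RMSE, experimental only: {rmse_exp_only:.3f}")
print(f"RMSE, augmented:         {rmse_augmented:.3f}")
```

In this sketch the only design decision is how the two label sources are combined: `augment` concatenates them with equal weight. Down-weighting the virtual rows, or modelling their noise separately, would be an alternative, and the abstract does not say which approach the authors took.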

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/152b/11094716/5ad3648b584e/ci4c00071_0001.jpg

Similar articles

Error Tolerance of Machine Learning Algorithms across Contemporary Biological Targets.

Molecules. 2019 Jun 4;24(11):2115. doi: 10.3390/molecules24112115.

Reaction-Based Enumeration, Active Learning, and Free Energy Calculations To Rapidly Explore Synthetically Tractable Chemical Space and Optimize Potency of Cyclin-Dependent Kinase 2 Inhibitors.

J Chem Inf Model. 2019 Sep 23;59(9):3782-3793. doi: 10.1021/acs.jcim.9b00367. Epub 2019 Aug 22.

Drug Discovery in the Age of Artificial Intelligence: Transformative Target-Based Approaches.

Int J Mol Sci. 2024 Nov 14;25(22):12233. doi: 10.3390/ijms252212233.

The Art and Science of Molecular Docking.

Annu Rev Biochem. 2024 Aug;93(1):389-410. doi: 10.1146/annurev-biochem-030222-120000. Epub 2024 Jul 2.

Using physics-based pose predictions and free energy perturbation calculations to predict binding poses and relative binding affinities for FXR ligands in the D3R Grand Challenge 2.

J Comput Aided Mol Des. 2018 Jan;32(1):21-44. doi: 10.1007/s10822-017-0075-9. Epub 2017 Nov 8.

Relative Binding Free Energy Calculations in Drug Discovery: Recent Advances and Practical Considerations.

J Chem Inf Model. 2017 Dec 26;57(12):2911-2937. doi: 10.1021/acs.jcim.7b00564. Epub 2017 Dec 15.

Synergy and Complementarity between Focused Machine Learning and Physics-Based Simulation in Affinity Prediction.

J Chem Inf Model. 2021 Dec 27;61(12):5948-5966. doi: 10.1021/acs.jcim.1c01382. Epub 2021 Dec 10.

Protein-Ligand Binding Free Energy Calculations with FEP.

Methods Mol Biol. 2019;2022:201-232. doi: 10.1007/978-1-4939-9608-7_9.

Explainable Machine Learning for Property Predictions in Compound Optimization.

J Med Chem. 2021 Dec 23;64(24):17744-17752. doi: 10.1021/acs.jmedchem.1c01789. Epub 2021 Dec 13.

Cited by

Active Learning FEP Using 3D-QSAR for Prioritizing Bioisosteres in Medicinal Chemistry.

ACS Med Chem Lett. 2025 Apr 29;16(6):984-990. doi: 10.1021/acsmedchemlett.4c00554. eCollection 2025 Jun 12.

How does machine learning augment alchemical binding free energy calculations?

Future Med Chem. 2025 Mar;17(5):509-511. doi: 10.1080/17568919.2025.2463870. Epub 2025 Feb 8.


