Molecular optimization by capturing chemist's intuition using deep neural networks.

Authors

He Jiazhen, You Huifang, Sandström Emil, Nittinger Eva, Bjerrum Esben Jannik, Tyrchan Christian, Czechtizky Werngard, Engkvist Ola

Affiliations

Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden.

Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.

Publication information

J Cheminform. 2021 Mar 20;13(1):26. doi: 10.1186/s13321-021-00497-0.

DOI:10.1186/s13321-021-00497-0

PMID:33743817

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7980633/

Abstract

A main challenge in drug discovery is finding molecules with a desirable balance of multiple properties. Here, we focus on the task of molecular optimization, where the goal is to optimize a given starting molecule towards desirable properties. This task can be framed as a machine translation problem in natural language processing, where in our case, a molecule is translated into a molecule with optimized properties based on the SMILES representation. Typically, chemists would use their intuition to suggest chemical transformations for the starting molecule being optimized. A widely used strategy is the concept of matched molecular pairs where two molecules differ by a single transformation. We seek to capture the chemist's intuition from matched molecular pairs using machine translation models. Specifically, the sequence-to-sequence model with attention mechanism, and the Transformer model are employed to generate molecules with desirable properties. As a proof of concept, three ADMET properties are optimized simultaneously: logD, solubility, and clearance, which are important properties of a drug. Since desirable properties often vary from project to project, the user-specified desirable property changes are incorporated into the input as an additional condition together with the starting molecules being optimized. Thus, the models can be guided to generate molecules satisfying the desirable properties. Additionally, we compare the two machine translation models based on the SMILES representation, with a graph-to-graph translation model HierG2G, which has shown the state-of-the-art performance in molecular optimization. Our results show that the Transformer can generate more molecules with desirable properties by making small modifications to the given starting molecules, which can be intuitive to chemists. A further enrichment of diverse molecules can be achieved by using an ensemble of models.
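The conditioning idea in the abstract, where the desired property changes are added to the input together with the starting molecule, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the token names (`LogD_change_[...)`, `Solubility_...`, `Clearance_...`), the 0.5-unit logD interval width, the three category labels, and the function names are all assumptions made for the example. The SMILES tokenizer uses a regular expression of the kind commonly used for SMILES-based translation models.

```python
import math
import re
from typing import List

# Regex of the kind commonly used to split SMILES into tokens
# (bracket atoms, two-letter halogens, bonds, branches, ring-closure digits).
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\."
    r"|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)

# Hypothetical category labels for a desired property change.
ALLOWED_CHANGES = {"low->high", "high->low", "no_change"}


def tokenize_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into tokens; fail if any character is dropped."""
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"Could not tokenize SMILES completely: {smiles!r}")
    return tokens


def logd_change_token(delta: float, width: float = 0.5) -> str:
    """Map a desired logD change to a hypothetical interval token."""
    lower = math.floor(delta / width) * width
    return f"LogD_change_[{lower:.1f},{lower + width:.1f})"


def build_source_sequence(
    smiles: str,
    logd_delta: float,
    solubility_change: str,
    clearance_change: str,
) -> List[str]:
    """Prepend property-change tokens to the tokenized starting molecule."""
    for change in (solubility_change, clearance_change):
        if change not in ALLOWED_CHANGES:
            raise ValueError(f"Unknown property change: {change!r}")
    condition = [
        logd_change_token(logd_delta),
        f"Solubility_{solubility_change}",
        f"Clearance_{clearance_change}",
    ]
    return condition + tokenize_smiles(smiles)


if __name__ == "__main__":
    # Example starting molecule and desired changes (illustrative values only).
    source = build_source_sequence(
        "CC(=O)Nc1ccc(Cl)cc1", -0.7, "low->high", "high->low"
    )
    print(" ".join(source))
```

A sequence built this way would serve as the source side of a translation pair, with the tokenized SMILES of the other molecule in a matched molecular pair as the target. Because the condition tokens are part of the input, the same trained model could be asked for different property changes at inference time.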

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5af4/7980633/83c639be467c/13321_2021_497_Fig1_HTML.jpg

Similar articles

Molecular optimization by capturing chemist's intuition using deep neural networks.

J Cheminform. 2021 Mar 20;13(1):26. doi: 10.1186/s13321-021-00497-0.

Transformer-based molecular optimization beyond matched molecular pairs.

J Cheminform. 2022 Mar 28;14(1):18. doi: 10.1186/s13321-022-00599-3.

Transformer-based deep learning method for optimizing ADMET properties of lead compounds.

Phys Chem Chem Phys. 2023 Jan 18;25(3):2377-2385. doi: 10.1039/d2cp05332b.

Evaluation of reinforcement learning in transformer-based molecular design.

J Cheminform. 2024 Aug 8;16(1):95. doi: 10.1186/s13321-024-00887-0.

FSM-DDTR: End-to-end feedback strategy for multi-objective De Novo drug design using transformers.

Comput Biol Med. 2023 Sep;164:107285. doi: 10.1016/j.compbiomed.2023.107285. Epub 2023 Jul 31.

Molecule generation using transformers and policy gradient reinforcement learning.

Sci Rep. 2023 May 31;13(1):8799. doi: 10.1038/s41598-023-35648-w.

Permutation Invariant Graph-to-Sequence Model for Template-Free Retrosynthesis and Reaction Prediction.

J Chem Inf Model. 2022 Aug 8;62(15):3503-3513. doi: 10.1021/acs.jcim.2c00321. Epub 2022 Jul 26.

DrugEx v3: scaffold-constrained drug design with graph transformer-based reinforcement learning.

J Cheminform. 2023 Feb 20;15(1):24. doi: 10.1186/s13321-023-00694-z.

Transformer neural network for protein-specific de novo drug generation as a machine translation problem.

Sci Rep. 2021 Jan 11;11(1):321. doi: 10.1038/s41598-020-79682-4.

Deep scaffold hopping with multimodal transformer neural networks.

J Cheminform. 2021 Nov 13;13(1):87. doi: 10.1186/s13321-021-00565-5.

Cited by

A 3D pocket-aware lead optimization model with knowledge guidance and its application for discovery of new glutaminyl cyclase inhibitors.

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf345.

A review of transformer models in drug discovery and beyond.

J Pharm Anal. 2025 Jun;15(6):101081. doi: 10.1016/j.jpha.2024.101081. Epub 2024 Aug 30.

Developing muscarinic receptor M1 classification models utilizing transfer learning and generative AI techniques.

Sci Rep. 2025 May 12;15(1):16486. doi: 10.1038/s41598-025-00972-w.

Sculpting molecules in text-3D space: a flexible substructure aware framework for text-oriented molecular optimization.

BMC Bioinformatics. 2025 May 7;26(1):123. doi: 10.1186/s12859-025-06072-w.

DrugAssist: a large language model for molecule optimization.

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae693.

Recent Applications of Artificial Intelligence in Discovery of New Antibacterial Agents.

Adv Appl Bioinform Chem. 2024 Dec 3;17:139-157. doi: 10.2147/AABC.S484321. eCollection 2024.

Exhaustive local chemical space exploration using a transformer model.

Nat Commun. 2024 Aug 25;15(1):7315. doi: 10.1038/s41467-024-51672-4.

Natural language processing with transformers: a review.

PeerJ Comput Sci. 2024 Aug 7;10:e2222. doi: 10.7717/peerj-cs.2222. eCollection 2024.

Evaluation of reinforcement learning in transformer-based molecular design.

J Cheminform. 2024 Aug 8;16(1):95. doi: 10.1186/s13321-024-00887-0.

Application progress of deep generative models in de novo drug design.

Mol Divers. 2024 Aug;28(4):2411-2427. doi: 10.1007/s11030-024-10942-5. Epub 2024 Aug 4.

References

Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models.

Front Pharmacol. 2020 Dec 18;11:565644. doi: 10.3389/fphar.2020.565644. eCollection 2020.

Efficient multi-objective molecular optimization in a continuous latent space.

Chem Sci. 2019 Jul 8;10(34):8016-8024. doi: 10.1039/c9sc01928f. eCollection 2019 Sep 14.

Molecular Transformer: A Model for Uncertainty-Calibrated Chemical Reaction Prediction.

ACS Cent Sci. 2019 Sep 25;5(9):1572-1583. doi: 10.1021/acscentsci.9b00576. Epub 2019 Aug 30.

Analyzing Learned Molecular Representations for Property Prediction.

J Chem Inf Model. 2019 Aug 26;59(8):3370-3388. doi: 10.1021/acs.jcim.9b00237. Epub 2019 Aug 13.

Optimization of Molecules via Deep Reinforcement Learning.

Sci Rep. 2019 Jul 24;9(1):10752. doi: 10.1038/s41598-019-47148-x.

GuacaMol: Benchmarking Models for de Novo Molecular Design.

J Chem Inf Model. 2019 Mar 25;59(3):1096-1108. doi: 10.1021/acs.jcim.8b00839. Epub 2019 Mar 19.

Multi-objective de novo drug design with conditional graph generative model.

J Cheminform. 2018 Jul 24;10(1):33. doi: 10.1186/s13321-018-0287-6.

Molecular generative model based on conditional variational autoencoder for de novo molecular design.

J Cheminform. 2018 Jul 11;10(1):31. doi: 10.1186/s13321-018-0286-7.

mmpdb: An Open-Source Matched Molecular Pair Platform for Large Multiproperty Data Sets.

J Chem Inf Model. 2018 May 29;58(5):902-910. doi: 10.1021/acs.jcim.8b00173. Epub 2018 May 17.

Reinforced Adversarial Neural Computer for de Novo Molecular Design.

J Chem Inf Model. 2018 Jun 25;58(6):1194-1204. doi: 10.1021/acs.jcim.7b00690. Epub 2018 Jun 12.
