Transformer-based molecular optimization beyond matched molecular pairs.

Authors

He Jiazhen, Nittinger Eva, Tyrchan Christian, Czechtizky Werngard, Patronov Atanas, Bjerrum Esben Jannik, Engkvist Ola

Affiliations

Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden.

Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden.

Publication information

J Cheminform. 2022 Mar 28;14(1):18. doi: 10.1186/s13321-022-00599-3.

DOI:10.1186/s13321-022-00599-3

PMID:35346368

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8962145/

Abstract

Molecular optimization aims to improve the drug profile of a starting molecule. It is a fundamental problem in drug discovery but challenging due to (i) the requirement of simultaneous optimization of multiple properties and (ii) the large chemical space to explore. Recently, deep learning methods have been proposed to solve this task by mimicking the chemist's intuition in terms of matched molecular pairs (MMPs). Although MMP analysis is a strategy widely used by medicinal chemists, it offers limited capability for exploring the space of structural modifications and therefore does not cover the complete space of solutions. More general transformations beyond MMPs are often feasible and/or necessary, e.g. simultaneous modifications of the starting molecule at different places, including the core scaffold. This study aims to provide a general methodology that offers more general structural modifications beyond MMPs. In particular, the same Transformer architecture is trained on different datasets, each consisting of a set of molecular pairs that reflect a different type of transformation. Beyond MMP transformations, datasets reflecting general structural changes are constructed from ChEMBL using two approaches: Tanimoto similarity (which allows for multiple modifications) and scaffold matching (which allows for multiple modifications but keeps the scaffold constant). We investigate how the model behavior can be altered by tailoring the dataset while using the same model architecture. Our results show that models trained on differently prepared datasets transform a given starting molecule in a way that reflects the nature of the dataset used for training. These models could complement each other and enable chemists to pursue different options for improving a starting molecule.
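As a rough illustration of the Tanimoto-similarity pairing idea mentioned in the abstract, the sketch below computes the Tanimoto coefficient between fingerprints represented as sets of on-bit indices and keeps the pairs at or above a cutoff. The fingerprint type, the threshold value, and the helper names `tanimoto` and `similar_pairs` are assumptions made for this example; the abstract does not specify them, and real fingerprints would normally come from a cheminformatics toolkit.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient: shared on-bits divided by the union of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similar_pairs(fingerprints: dict, threshold: float) -> list:
    """Return (id_a, id_b, similarity) for every unordered pair at or above threshold."""
    pairs = []
    for id_a, id_b in combinations(sorted(fingerprints), 2):
        sim = tanimoto(fingerprints[id_a], fingerprints[id_b])
        if sim >= threshold:
            pairs.append((id_a, id_b, sim))
    return pairs


# Toy fingerprints: hypothetical bit indices, not derived from real molecules.
toy = {
    "mol_A": frozenset({1, 2, 3, 4}),
    "mol_B": frozenset({1, 2, 3, 5}),
    "mol_C": frozenset({7, 8, 9}),
}

# The 0.5 cutoff is an illustrative assumption, not a value from the paper.
pairs = similar_pairs(toy, threshold=0.5)
print(pairs)
```

Here only `mol_A` and `mol_B` share enough bits to be paired; `mol_C` has no bits in common with either and is left out. A scaffold-matching dataset would instead pair molecules that share the same core scaffold, which this sketch does not attempt.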

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c66a/8962145/ecb307946476/13321_2022_599_Fig1_HTML.jpg

Similar articles

Transformer-based molecular optimization beyond matched molecular pairs.

J Cheminform. 2022 Mar 28;14(1):18. doi: 10.1186/s13321-022-00599-3.

Molecular optimization by capturing chemist's intuition using deep neural networks.

J Cheminform. 2021 Mar 20;13(1):26. doi: 10.1186/s13321-021-00497-0.

Evaluation of reinforcement learning in transformer-based molecular design.

J Cheminform. 2024 Aug 8;16(1):95. doi: 10.1186/s13321-024-00887-0.

Transformer-based deep learning method for optimizing ADMET properties of lead compounds.

Phys Chem Chem Phys. 2023 Jan 18;25(3):2377-2385. doi: 10.1039/d2cp05332b.

Prediction-driven matched molecular pairs to interpret QSARs and aid the molecular optimization process.

J Cheminform. 2014 Dec 11;6(1):48. doi: 10.1186/s13321-014-0048-0. eCollection 2014.

Deep scaffold hopping with multimodal transformer neural networks.

J Cheminform. 2021 Nov 13;13(1):87. doi: 10.1186/s13321-021-00565-5.

DrugEx v3: scaffold-constrained drug design with graph transformer-based reinforcement learning.

J Cheminform. 2023 Feb 20;15(1):24. doi: 10.1186/s13321-023-00694-z.

Chemical rules for optimization of chemical mutagenicity via matched molecular pairs analysis and machine learning methods.

J Cheminform. 2023 Mar 20;15(1):35. doi: 10.1186/s13321-023-00707-x.

OptADMET: a web-based tool for substructure modifications to improve ADMET properties of lead compounds.

Nat Protoc. 2024 Apr;19(4):1105-1121. doi: 10.1038/s41596-023-00942-4. Epub 2024 Jan 23.

Can We Quickly Learn to "Translate" Bioactive Molecules with Transformer Models?

J Chem Inf Model. 2023 Mar 27;63(6):1734-1744. doi: 10.1021/acs.jcim.2c01618. Epub 2023 Mar 13.

Cited by

MolMod: a molecular modification platform for molecular property optimization via fragment-based generation.

Mol Divers. 2025 Sep 4. doi: 10.1007/s11030-025-11342-z.

Applications of Artificial Intelligence in Biotech Drug Discovery and Product Development.

MedComm (2020). 2025 Jul 30;6(8):e70317. doi: 10.1002/mco2.70317. eCollection 2025 Aug.

Comparison study of dominant molecular sequence representation based on diffusion model.

J Comput Aided Mol Des. 2025 Jul 18;39(1):54. doi: 10.1007/s10822-025-00614-3.

Developing muscarinic receptor M1 classification models utilizing transfer learning and generative AI techniques.

Sci Rep. 2025 May 12;15(1):16486. doi: 10.1038/s41598-025-00972-w.

PepINVENT: generative peptide design beyond natural amino acids.

Chem Sci. 2025 Apr 16;16(20):8682-8696. doi: 10.1039/d4sc07642g. eCollection 2025 May 21.

Accelerating discovery of bioactive ligands with pharmacophore-informed generative models.

Nat Commun. 2025 Mar 10;16(1):2391. doi: 10.1038/s41467-025-56349-0.

Molecular optimization using a conditional transformer for reaction-aware compound exploration with reinforcement learning.

Commun Chem. 2025 Feb 8;8(1):40. doi: 10.1038/s42004-025-01437-x.

DrugAssist: a large language model for molecule optimization.

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae693.

Exhaustive local chemical space exploration using a transformer model.

Nat Commun. 2024 Aug 25;15(1):7315. doi: 10.1038/s41467-024-51672-4.

Evaluation of reinforcement learning in transformer-based molecular design.

J Cheminform. 2024 Aug 8;16(1):95. doi: 10.1186/s13321-024-00887-0.

