Molecular Generation with Reduced Labeling through Constraint Architecture.

Affiliations

College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China.

School of Computer Science, Wuhan University, Wuhan, Hubei 430072, P. R. China.

Publication information

J Chem Inf Model. 2023 Jun 12;63(11):3319-3327. doi: 10.1021/acs.jcim.3c00579. Epub 2023 May 15.

DOI: 10.1021/acs.jcim.3c00579

PMID: 37184885

Abstract

In the past few years, a number of machine learning (ML)-based molecular generative models have been proposed for generating molecules with desirable properties, but they all require large amounts of labeled data on pharmacological and physicochemical properties. However, experimental determination of these labels, especially bioactivity labels, is very expensive. In this study, we analyze the dependence of various multi-property molecule generation models on biological activity label data and propose Frag-G/M, a fragment-based multi-constraint molecular generation framework based on a conditional transformer, recurrent neural networks (RNNs), and reinforcement learning (RL). The experimental results show that, using the same number of labels, Frag-G/M generates several times more desired molecules than the baselines. Moreover, relative to the known active compounds, the molecules generated by Frag-G/M exhibit higher scaffold diversity than those generated by the baselines, making the framework more promising for real-world drug discovery scenarios.
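The abstract does not state how scaffold diversity is measured, so the following is only a minimal sketch of two common proxies, not the authors' metric. It assumes each molecule has already been reduced to a canonical scaffold string (for example a Bemis-Murcko scaffold SMILES computed elsewhere); the function names and the example scaffolds are hypothetical.

```python
def scaffold_diversity(scaffolds):
    """Fraction of distinct scaffolds among the generated molecules (0.0 if empty)."""
    if not scaffolds:
        return 0.0
    return len(set(scaffolds)) / len(scaffolds)


def novel_scaffold_fraction(generated, known_actives):
    """Fraction of distinct generated scaffolds that do not occur in the known actives."""
    distinct = set(generated)
    if not distinct:
        return 0.0
    return len(distinct - set(known_actives)) / len(distinct)


# Hypothetical scaffold SMILES, assumed to be precomputed and canonicalized.
known = ["c1ccccc1", "c1ccncc1"]
generated = ["c1ccccc1", "c1ccc2ccccc2c1", "c1ccc2ccccc2c1", "c1ccoc1"]

print(scaffold_diversity(generated))              # distinct scaffolds / all generated
print(novel_scaffold_fraction(generated, known))  # share of distinct scaffolds not in known actives
```

Under these assumptions, a model that scores higher on both numbers than a baseline, for the same label budget, would be producing a wider and less redundant set of scaffolds.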


Similar articles

FSM-DDTR: End-to-end feedback strategy for multi-objective De Novo drug design using transformers.

Comput Biol Med. 2023 Sep;164:107285. doi: 10.1016/j.compbiomed.2023.107285. Epub 2023 Jul 31.

Generative machine learning for de novo drug discovery: A systematic review.

Comput Biol Med. 2022 Jun;145:105403. doi: 10.1016/j.compbiomed.2022.105403. Epub 2022 Mar 13.

Comprehensive assessment of deep generative architectures for de novo drug design.

Brief Bioinform. 2022 Jan 17;23(1). doi: 10.1093/bib/bbab544.

GRELinker: A Graph-Based Generative Model for Molecular Linker Design with Reinforcement and Curriculum Learning.

J Chem Inf Model. 2024 Feb 12;64(3):666-676. doi: 10.1021/acs.jcim.3c01700. Epub 2024 Jan 19.

Faster and more diverse de novo molecular optimization with double-loop reinforcement learning using augmented SMILES.

J Comput Aided Mol Des. 2023 Aug;37(8):373-394. doi: 10.1007/s10822-023-00512-6. Epub 2023 Jun 17.

CMGN: a conditional molecular generation net to design target-specific molecules with desired properties.

Brief Bioinform. 2023 Jul 20;24(4). doi: 10.1093/bib/bbad185.

Training recurrent neural networks as generative neural networks for molecular structures: how does it impact drug discovery?

Expert Opin Drug Discov. 2022 Oct;17(10):1071-1079. doi: 10.1080/17460441.2023.2134340. Epub 2022 Oct 17.

High-content image generation for drug discovery using generative adversarial networks.

Neural Netw. 2020 Dec;132:353-363. doi: 10.1016/j.neunet.2020.09.007. Epub 2020 Sep 20.

Evaluation of reinforcement learning in transformer-based molecular design.

J Cheminform. 2024 Aug 8;16(1):95. doi: 10.1186/s13321-024-00887-0.

Cited by

Effective generation of heavy-atom-free triplet photosensitizers containing multiple intersystem crossing mechanisms based on deep learning.

Chem Sci. 2025 Jul 8. doi: 10.1039/d5sc03192c.

Token-Mol 1.0: tokenized drug design with large language models.

Nat Commun. 2025 May 13;16(1):4416. doi: 10.1038/s41467-025-59628-y.

A structure-based framework for selective inhibitor design and optimization.

Commun Biol. 2025 Mar 12;8(1):422. doi: 10.1038/s42003-025-07840-3.

3DSMILES-GPT: 3D molecular pocket-based generation with token-only large language model.

Chem Sci. 2024 Dec 4;16(2):637-648. doi: 10.1039/d4sc06864e. eCollection 2025 Jan 2.
