Magicmol: a light-weighted pipeline for drug-like molecule evolution and quick chemical space exploration.

Affiliations

Yangtze Delta Region (Huzhou) Institute of Intelligent Transportation, Huzhou University, Huzhou, China.

Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, School of Information Engineering, Huzhou University, Huzhou, China.

Publication information

BMC Bioinformatics. 2023 Apr 26;24(1):173. doi: 10.1186/s12859-023-05286-0.

DOI:10.1186/s12859-023-05286-0

PMID:37101113

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10132416/

Abstract

The rise of machine learning and deep learning methods has accelerated the development of cheminformatics, especially in drug discovery and new material exploration. Lower time and space costs make it feasible for scientists to search the enormous chemical space. Recently, several studies have combined reinforcement learning strategies with recurrent neural network (RNN)-based models to optimize the properties of generated small molecules, which notably improved a number of critical properties of these candidates. However, a common problem among these RNN-based methods is that some generated molecules are difficult to synthesize despite scoring higher on desired properties such as binding affinity. At the same time, RNN-based frameworks reproduce the molecule distribution of the training set better than other categories of models during molecule exploration tasks. Thus, to optimize the whole exploration process and make it contribute to the optimization of specified molecules, we devised a lightweight pipeline called Magicmol; this pipeline has a re-mastered RNN and uses the SELFIES representation instead of SMILES. Our backbone model achieved extraordinary performance while reducing the training cost; moreover, we devised reward truncation strategies to eliminate the model collapse problem. Additionally, adopting the SELFIES representation made it possible to combine STONED-SELFIES as a post-processing procedure for specified molecule optimization and quick chemical space exploration.
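To make the post-processing idea in the abstract more concrete, the following is a minimal sketch of STONED-style exploration: random point mutations (replace, insert, delete) applied to the tokens of a SELFIES string. It is not the authors' implementation. The names `ALPHABET`, `split_tokens`, `mutate`, and `explore` are hypothetical, the token alphabet is a small toy set chosen for illustration, and the sketch works on strings only, so it never decodes the results back to molecules. In practice one would presumably draw the alphabet from, and decode the mutated strings with, a SELFIES toolkit such as the `selfies` Python package.

```python
import random
import re

# Hypothetical toy alphabet for illustration only; a real run would use the
# alphabet provided by a SELFIES toolkit.
ALPHABET = ["[C]", "[N]", "[O]", "[F]", "[=C]", "[=N]", "[=O]", "[Branch1]", "[Ring1]"]


def split_tokens(selfies_string):
    """Split a SELFIES string such as '[C][C][O]' into its bracketed tokens."""
    return re.findall(r"\[[^\]]*\]", selfies_string)


def mutate(selfies_string, rng):
    """Apply one random point mutation: replace, insert, or delete a token."""
    tokens = split_tokens(selfies_string)
    operation = rng.choice(["replace", "insert", "delete"])
    if not tokens:
        operation = "insert"   # nothing to replace or delete yet
    elif operation == "delete" and len(tokens) == 1:
        operation = "replace"  # avoid returning an empty string
    if operation == "replace":
        position = rng.randrange(len(tokens))
        tokens[position] = rng.choice(ALPHABET)
    elif operation == "insert":
        position = rng.randrange(len(tokens) + 1)
        tokens.insert(position, rng.choice(ALPHABET))
    else:
        position = rng.randrange(len(tokens))
        del tokens[position]
    return "".join(tokens)


def explore(seed, n_variants, rng):
    """Collect up to n_variants distinct single-mutation neighbours of seed."""
    variants = set()
    attempts = 0
    # Bounded retry loop: a replacement can reproduce the seed, so retry.
    while len(variants) < n_variants and attempts < 100 * n_variants:
        candidate = mutate(seed, rng)
        if candidate != seed:
            variants.add(candidate)
        attempts += 1
    return sorted(variants)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    for variant in explore("[C][C][O]", 5, rng):
        print(variant)
```

Working at the token level is what makes this kind of mutation cheap: SELFIES is designed so that any sequence of its tokens decodes to a valid molecule, whereas an arbitrary edit to a SMILES string often yields an invalid one.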

