RediscMol: Benchmarking Molecular Generation Models in Biological Properties.

Affiliations

Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China.

Advanced Computing and Storage Laboratory, Central Research Institute, 2012 Laboratories, Huawei Technologies Co., Ltd., Shenzhen 518129, Guangdong, China.

Publication information

J Med Chem. 2024 Jan 25;67(2):1533-1543. doi: 10.1021/acs.jmedchem.3c02051. Epub 2024 Jan 5.

DOI: 10.1021/acs.jmedchem.3c02051

PMID: 38181194

Abstract

Deep learning-based molecular generative models have garnered emerging attention for their capability to generate molecules with novel structures and desired physicochemical properties. However, the evaluation of these models, particularly in a biological context, remains insufficient. To address the limitations of existing metrics and emulate practical application scenarios, we construct the RediscMol benchmark that comprises active molecules extracted from 5 kinase and 3 GPCR data sets. A set of rediscovery- and similarity-related metrics are introduced to assess the performance of 8 representative generative models (CharRNN, VAE, Reinvent, AAE, ORGAN, RNNAttn, TransVAE, and GraphAF). Our findings based on the RediscMol benchmark differ from those of previous evaluations. CharRNN, VAE, and Reinvent exhibit a greater ability to reproduce known active molecules, while RNNAttn, TransVAE, and GraphAF struggle in this aspect despite their notable performance on commonly used distribution-learning metrics. Our evaluation framework may provide valuable guidance for advancing generative models in real-world drug design scenarios.
