J Chem Inf Model. 2020 Jan 27;60(1):29-36. doi: 10.1021/acs.jcim.9b00694. Epub 2019 Dec 24.
Deep generative models are attracting considerable attention as a promising new approach to molecular design. The various models reported so far are based on either a variational autoencoder (VAE) or a generative adversarial network (GAN), but they have limitations such as low validity and uniqueness. Here, we propose a new type of model based on an adversarially regularized autoencoder (ARAE). Like a VAE, it uses latent variables, but the distribution of the latent variables is estimated by adversarial training, as in a GAN. This adversarial estimation is intended to avoid both the insufficiently flexible approximation of the posterior distribution in a VAE and the difficulty of handling discrete variables in a GAN. Our benchmark study showed that ARAE indeed outperformed conventional models in terms of validity, uniqueness, and novelty per generated molecule. We also demonstrated successful conditional generation of drug-like molecules with ARAE for both single-property and multiple-property control. As a potential real-world application, we generated epidermal growth factor receptor inhibitors that share the scaffolds of known active molecules while simultaneously satisfying drug-like conditions.
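The benchmark criteria named above (validity, uniqueness, and novelty) can be illustrated with a minimal Python sketch. The definitions used here are the ones commonly adopted in the molecular generation literature and are an assumption; the paper's exact formulas may differ. The names `generation_metrics` and `toy_canonicalize` are hypothetical, and the toy validity check (balanced parentheses) merely stands in for a real chemistry toolkit.

```python
from typing import Callable, Dict, Iterable, Optional


def generation_metrics(
    generated: Iterable[str],
    training_set: Iterable[str],
    canonicalize: Callable[[str], Optional[str]],
) -> Dict[str, float]:
    """Compute common generation metrics (assumed definitions, not the paper's).

    `canonicalize` returns a canonical string for a valid molecule
    and None for an invalid one.
    """
    generated = list(generated)
    # Keep canonical forms of valid molecules; invalid ones map to None.
    valid = [c for c in (canonicalize(s) for s in generated) if c is not None]
    unique = set(valid)
    known = {c for c in (canonicalize(s) for s in training_set) if c is not None}
    novel = unique - known  # unique valid molecules absent from the training set
    n = len(generated)
    return {
        "validity": len(valid) / n if n else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
        "novel_per_generated": len(novel) / n if n else 0.0,
    }


def toy_canonicalize(s: str) -> Optional[str]:
    """Hypothetical stand-in: 'valid' only means non-empty with balanced parentheses."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return None
    return s if s and depth == 0 else None


if __name__ == "__main__":
    samples = ["CCO", "CCO", "CC(C)O", "CC(", "c1ccccc1"]
    print(generation_metrics(samples, ["CCO"], toy_canonicalize))
```

In practice, `canonicalize` would wrap a cheminformatics library; for example, RDKit's `Chem.MolFromSmiles` returns `None` for an unparsable SMILES string, and `Chem.MolToSmiles` produces a canonical form. The last ratio, `novel_per_generated`, is one plausible reading of "novelty per generated molecule".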