Shino Yuto, Kaneko Hiromasa
Department of Applied Chemistry, School of Science and Technology, Meiji University, 1-1-1 Higashi-Mita, Tama-ku, Kawasaki, Kanagawa 214-8571, Japan.
Mol Inform. 2025 Jan;44(1):e202400227. doi: 10.1002/minf.202400227.
Recent advances in machine learning have significantly impacted molecular design, notably through a molecular generation method that combines the chemical variational autoencoder (VAE) with Gaussian mixture regression (GMR). In this method, a mathematical model is constructed with the latent variables of a molecule as X and the target properties and activities as Y. Direct inverse analysis of this model makes it possible to generate molecules with the desired target properties. However, this approach outputs many strings that do not follow the Simplified Molecular Input Line Entry System (SMILES) grammar, and it generates unrealistic chemical structures whose properties and activities do not satisfy the target values. In this study, we focus on a hierarchical VAE that uses molecular graphs to address these issues. We confirm that the combination of the hierarchical VAE and GMR does not generate invalid outputs and returns molecules that simultaneously satisfy multiple target values. Moreover, we use this method to identify several molecules that are predicted to exhibit activity against drug targets.
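The direct inverse analysis described in the abstract can be illustrated with the standard GMR conditioning formula. The sketch below is not the authors' implementation: it assumes a Gaussian mixture has already been fitted to the joint vector [X, Y], and the function names (`log_gaussian`, `gmr_inverse`), the toy parameters, and the one-dimensional X and Y are all hypothetical choices made for illustration. Decoding the resulting latent vector into a molecule with the VAE is not shown.

```python
import numpy as np


def log_gaussian(y, mean, cov):
    """Log density of a multivariate normal N(mean, cov) evaluated at y."""
    diff = y - mean
    _, log_det = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (maha + log_det + diff.size * np.log(2.0 * np.pi))


def gmr_inverse(y_target, weights, means, covs, n_x):
    """Condition a joint Gaussian mixture over [X, Y] on Y = y_target.

    Returns the mixture-averaged estimate of X, the per-component
    responsibilities, and the per-component conditional means of X.
    The first n_x entries of each mean/covariance are assumed to belong to X.
    """
    y_target = np.asarray(y_target, dtype=float)
    cond_means, log_resp = [], []
    for pi_k, mu, sigma in zip(weights, means, covs):
        mu_x, mu_y = mu[:n_x], mu[n_x:]
        s_xy = sigma[:n_x, n_x:]
        s_yy = sigma[n_x:, n_x:]
        # Conditional mean of X given Y for this component.
        cond_means.append(mu_x + s_xy @ np.linalg.solve(s_yy, y_target - mu_y))
        # Unnormalised log responsibility: log(pi_k) + log N(y | mu_y, s_yy).
        log_resp.append(np.log(pi_k) + log_gaussian(y_target, mu_y, s_yy))
    log_resp = np.array(log_resp)
    resp = np.exp(log_resp - log_resp.max())  # subtract max to avoid underflow
    resp /= resp.sum()
    cond_means = np.array(cond_means)
    return resp @ cond_means, resp, cond_means


# Toy joint model (hypothetical values): one latent dimension, one property.
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [4.0, 4.0]])
covs = np.array([[[1.0, 0.5], [0.5, 1.0]]] * 2)

x_hat, resp, cond_means = gmr_inverse([4.0], weights, means, covs, n_x=1)
print("estimated latent X:", x_hat)
print("responsibilities:", resp)
```

With several target properties, `y_target` simply becomes a longer vector and the Y blocks of each mean and covariance grow accordingly. Taking the conditional mean of the highest-responsibility component instead of the weighted average is a common alternative when the mixture is strongly multimodal.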