Wijaya Kevin Tirta, Ansari Navid, Seidel Hans-Peter, Babaei Vahid
Max Planck Institute for Informatics, Saarland Informatics Campus, 66123, Saarbruecken, Germany.
Adv Sci (Weinh). 2025 Jul;12(27):e2416356. doi: 10.1002/advs.202416356. Epub 2025 May 8.
Data-driven inverse molecular design (IMD) has attracted significant attention in recent years. Despite remarkable progress, existing IMD methods lag behind in trustworthiness, as indicated by their misalignment with the ground-truth function that models the molecular dynamics. Here, TrustMol is proposed: an IMD method built to be trustworthy by inverting a reliable molecular property predictor. TrustMol first constructs a latent space with a novel variational autoencoder (VAE) and trains an ensemble of property predictors to learn the mapping from the latent space to the property space. The training samples for the ensemble are obtained with a new reacquisition method that ensures the samples are representative of the latent space. To generate a desired molecule, TrustMol optimizes a latent design by minimizing both the predictive error and the uncertainty quantified by the ensemble. As a result, TrustMol achieves state-of-the-art IMD accuracy and, more importantly, is aligned with the ground-truth function, which indicates trustworthiness.
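The latent optimization step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a toy ensemble of random linear predictors in place of trained networks, central finite differences in place of automatic differentiation, and a hypothetical weight `lam` on the uncertainty term. The VAE and the reacquisition step are omitted, and all function names are invented for illustration.

```python
import random
from statistics import mean, pvariance


def make_ensemble(n_members=5, dim=2, seed=0):
    """Hypothetical stand-in for trained predictors: random linear maps z -> property."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        # Members share a common trend but differ slightly,
        # loosely mimicking independently trained models.
        weights = [1.0 + rng.gauss(0.0, 0.1) for _ in range(dim)]
        bias = rng.gauss(0.0, 0.1)
        members.append((weights, bias))
    return members


def predict_all(ensemble, z):
    """Return every member's prediction for the latent vector z."""
    return [sum(w * x for w, x in zip(weights, z)) + bias
            for weights, bias in ensemble]


def objective(ensemble, z, target, lam=1.0):
    """Squared error of the ensemble mean plus lam times the ensemble variance."""
    preds = predict_all(ensemble, z)
    return (mean(preds) - target) ** 2 + lam * pvariance(preds)


def optimize_latent(ensemble, target, dim=2, lam=1.0, lr=0.05, steps=500, eps=1e-5):
    """Plain gradient descent on z, with gradients from central finite differences."""
    z = [0.0] * dim
    for _ in range(steps):
        grad = []
        for i in range(dim):
            z_hi = z[:i] + [z[i] + eps] + z[i + 1:]
            z_lo = z[:i] + [z[i] - eps] + z[i + 1:]
            grad.append((objective(ensemble, z_hi, target, lam)
                         - objective(ensemble, z_lo, target, lam)) / (2 * eps))
        z = [x - lr * g for x, g in zip(z, grad)]
    return z


if __name__ == "__main__":
    ensemble = make_ensemble()
    z_opt = optimize_latent(ensemble, target=2.0)
    print("optimized latent:", z_opt)
    print("objective:", objective(ensemble, z_opt, 2.0))
```

In this sketch, the variance across ensemble members plays the role of the uncertainty term: latent points where the members disagree are penalized, so the search is steered toward regions where the predictors are more consistent with one another.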