Langevin Maxime, Vuilleumier Rodolphe, Bianciotto Marc
Molecular Design Sciences - Integrated Drug Discovery, Sanofi R&D, 94400, Vitry-sur-Seine, France.
PASTEUR, Département de chimie, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 75005, Paris, France.
J Cheminform. 2022 Apr 1;14(1):20. doi: 10.1186/s13321-022-00601-y.
Despite growing interest and success in automated in-silico molecular design, questions remain regarding the ability of goal-directed generation algorithms to perform unbiased exploration of novel chemical spaces. A specific phenomenon has recently been highlighted: goal-directed generation guided by machine learning models produces molecules with high scores according to the optimization model but low scores according to control models, even when the control models are trained on the same data distribution and the same target. In this work, we show that this worrisome behavior is actually due to issues with the predictive models and not the goal-directed generation algorithms. We show that with appropriate predictive models, this issue can be resolved, and the generated molecules have high scores according to both the optimization and the control models.