Department of Computer Science, Brock University, 1812 Sir Isaac Brock Way, St. Catharines, L2S 3A1, Ontario, Canada.
Digital Technologies Research Centre, National Research Council Canada, 1200 Montreal Road, Ottawa, K1A 0R6, Ontario, Canada.
Biosystems. 2023 Oct;232:104989. doi: 10.1016/j.biosystems.2023.104989. Epub 2023 Aug 6.
Drug design and optimization are challenging tasks that call for strategic and efficient exploration of an extremely vast search space. Multiple fragmentation strategies have been proposed in the literature to mitigate the complexity of the molecular search space. From an optimization standpoint, drug design can be considered a multi-objective optimization problem. Deep reinforcement learning (DRL) frameworks have demonstrated encouraging results in the field of drug design. However, the scalability of these frameworks is impeded by long training times and inefficient use of sample data. In this paper, we (1) examine the core principles of deep and multi-objective RL methods and their applications in molecular design, (2) analyze the performance of DeepFMPO, a recent multi-objective, fragment-based DRL drug design framework, in a real-world application by incorporating the optimization of protein-ligand docking affinity alongside varying numbers of other objectives, and (3) compare this method with a single-objective variant. Our trials indicate that the DeepFMPO framework (with docking score) can achieve success; however, it suffers from training instability. Our findings encourage additional exploration and improvement of the framework. Potential sources of the framework's instability and suggestions for further modifications to stabilize it are discussed.
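To make the multi-objective setup concrete, the sketch below shows one generic way a docking affinity term can be folded into a scalarized reward alongside other property objectives, as might be done when extending a DRL drug-design framework. This is only a minimal illustration under assumed conventions (weighted-sum scalarization, hypothetical objective names, weights, and normalization ranges); it is not the actual reward formulation used by DeepFMPO.

```python
# Hypothetical sketch: folding a docking score into a scalarized
# multi-objective reward for a DRL drug-design agent. All objective
# names, weights, and normalization ranges are illustrative assumptions,
# not the DeepFMPO implementation.

def normalize(value: float, lo: float, hi: float) -> float:
    """Clip and rescale a raw objective value to [0, 1]."""
    return max(0.0, min(1.0, (value - lo) / (hi - lo)))

def scalarized_reward(objectives: dict[str, float],
                      weights: dict[str, float]) -> float:
    """Weighted-sum scalarization of normalized objective values."""
    total_weight = sum(weights.values())
    return sum(weights[name] * objectives[name] for name in weights) / total_weight

# Example: a docking score of -9.2 kcal/mol (more negative is better)
# combined with two other already-normalized objectives (values hypothetical).
objectives = {
    "docking": normalize(9.2, 4.0, 12.0),  # sign flipped so larger is better
    "qed": 0.71,
    "sa": 0.64,
}
weights = {"docking": 0.5, "qed": 0.3, "sa": 0.2}
print(f"scalarized reward: {scalarized_reward(objectives, weights):.3f}")
```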