School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China.
Galixir Technologies (Beijing) Limited, Beijing 100083, China.
J Chem Inf Model. 2021 Apr 26;61(4):1627-1636. doi: 10.1021/acs.jcim.0c01416. Epub 2021 Mar 17.
The goal of molecular optimization (MO) is to discover molecules with improved pharmaceutical properties relative to a known starting molecule. Despite many recent successes, new approaches for MO have typically been developed for particular properties with abundant annotated training examples. These approaches are therefore difficult to apply in real-world settings, where only a small amount of pharmaceutical data is usually available because data collection is expensive and labor-intensive. Here, we propose a new approach, Meta-MO, for molecular optimization with a handful of training samples, based on well-established first-order meta-learning algorithms. Using a set of meta tasks with abundant training samples, Meta-MO trains a meta model through meta-learning optimization and then adapts the learned model to new low-resource MO tasks. Meta-MO consistently outperformed several pretraining and multitask training procedures, providing an average improvement in success rate of 4.3% on a large-scale bioactivity data set with diverse target variations. Meta-MO also yielded the best-performing models across fine-tuning sets with only dozens of samples. To the best of our knowledge, this is the first study to apply meta-learning to MO tasks. More importantly, this strategy could be extended to many low-resource scenarios in real-world drug design.
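To make the train-then-adapt idea concrete, the following is a minimal sketch of one first-order meta-learning algorithm, Reptile, applied to a toy problem. The abstract does not say which first-order algorithm Meta-MO uses, so the choice of Reptile is an assumption, and the one-dimensional linear regression tasks merely stand in for the molecular generation model and bioactivity tasks. All function names, task definitions, and hyperparameters here are hypothetical and are not taken from the paper.

```python
import random


def sgd_steps(params, data, lr, steps):
    """Run a few full-batch gradient steps on squared error for y = w*x + b."""
    w, b = params
    for _ in range(steps):
        gw = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        gb = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w, b = w - lr * gw, b - lr * gb
    return (w, b)


def make_task(rng, n):
    """Hypothetical task family: lines whose slope and intercept vary around (3, -1)."""
    w_true = 3.0 + rng.uniform(-0.5, 0.5)
    b_true = -1.0 + rng.uniform(-0.5, 0.5)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, w_true * x + b_true) for x in xs]


def mse(params, data):
    """Mean squared error of the linear model on a data set."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def reptile(rng, meta_iters=500, inner_steps=5, inner_lr=0.1, meta_lr=0.1):
    """Reptile: adapt to a sampled data-rich task, then move the meta
    parameters a small step toward the adapted parameters."""
    theta = (0.0, 0.0)
    for _ in range(meta_iters):
        task = make_task(rng, n=50)                           # data-rich meta task
        phi = sgd_steps(theta, task, inner_lr, inner_steps)   # inner-loop adaptation
        theta = tuple(t + meta_lr * (p - t) for t, p in zip(theta, phi))
    return theta


rng = random.Random(0)
theta = reptile(rng)

# New low-resource task: 5 samples for adaptation, 50 held out for evaluation.
new_task = make_task(rng, n=55)
support, query = new_task[:5], new_task[5:]

adapted_meta = sgd_steps(theta, support, lr=0.1, steps=5)
adapted_scratch = sgd_steps((0.0, 0.0), support, lr=0.1, steps=5)

print("meta-initialized MSE:", mse(adapted_meta, query))
print("from-scratch MSE:    ", mse(adapted_scratch, query))
```

In this toy setting the meta-learned initialization ends up near the center of the task family, so a few gradient steps on five samples are enough to fit a new task, whereas the same steps from a zero initialization leave a much larger error. This is only an illustration of the general mechanism and says nothing about the magnitude of the gains reported for Meta-MO.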