School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China.
Global Institute of Future Technology, Shanghai Jiao Tong University, Shanghai, China.
Nat Commun. 2024 May 25;15(1):4476. doi: 10.1038/s41467-024-48801-4.
Protein functions are characterized by interactions with proteins, drugs, and other biomolecules. Understanding these interactions is essential for deciphering the molecular mechanisms underlying biological processes and for developing new therapeutic strategies. Current computational methods mostly predict interactions from either molecular network information or structural information, without integrating the two within a unified multi-scale framework. A few multi-view learning methods do attempt to fuse multi-scale information, but they tend to rely heavily on a single scale while under-fitting the others, likely because of the imbalanced nature and inherent greediness of multi-scale learning. To alleviate this optimization imbalance, we present MUSE, a multi-scale representation learning framework based on variational expectation-maximization that optimizes the different scales in an alternating procedure over multiple iterations. This strategy efficiently fuses information across the atomic-structure and molecular-network scales through mutual supervision and iterative optimization. MUSE outperforms the current state-of-the-art models not only in molecular interaction tasks (protein-protein, drug-protein, and drug-drug) but also in protein interface prediction at the atomic-structure scale. More importantly, the multi-scale learning framework shows potential for extension to other scales of computational drug discovery.
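The alternating, mutually supervised optimization described in the abstract can be illustrated with a toy sketch. This is not the MUSE implementation: each "scale" is reduced here to a one-feature logistic regression, the data are synthetic, and every name (`alternating_fit`, `make_toy_data`, `struct_x`, `net_x`) is hypothetical. The only idea carried over is that one model is held fixed while its predictions on unlabeled items serve as soft targets for the other, and the roles then swap over several rounds.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, x):
    # w = [bias, w_1, ..., w_d]; x = [x_1, ..., x_d]
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))


def fit(w, xs, targets, lr=0.5, epochs=200):
    """Batch gradient descent on cross-entropy; targets may be soft (in [0, 1])."""
    w = list(w)
    n = len(xs)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, t in zip(xs, targets):
            err = predict(w, x) - t
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def alternating_fit(struct_x, net_x, labels, rounds=5):
    """Alternate between two models; labels[i] is 0/1 or None (unlabeled)."""
    lab = [i for i, y in enumerate(labels) if y is not None]
    unl = [i for i, y in enumerate(labels) if y is None]
    hard = [labels[i] for i in lab]
    w_s = [0.0] * (len(struct_x[0]) + 1)  # "structure-scale" model (assumed form)
    w_n = [0.0] * (len(net_x[0]) + 1)     # "network-scale" model (assumed form)
    # Warm start: fit the structure-scale model on labeled items only.
    w_s = fit(w_s, [struct_x[i] for i in lab], hard)
    for _ in range(rounds):
        # Structure model fixed: its predictions supervise the network model.
        pseudo = [predict(w_s, struct_x[i]) for i in unl]
        w_n = fit(w_n, [net_x[i] for i in lab + unl], hard + pseudo)
        # Network model fixed: its predictions supervise the structure model.
        pseudo = [predict(w_n, net_x[i]) for i in unl]
        w_s = fit(w_s, [struct_x[i] for i in lab + unl], hard + pseudo)
    return w_s, w_n


def make_toy_data(n=200, n_labeled=20, seed=0):
    # Synthetic data: each view is a noisy copy of the true label mapped to -1/+1.
    rng = random.Random(seed)
    truth = [i % 2 for i in range(n)]
    struct_x = [[(2 * y - 1) + rng.gauss(0, 0.5)] for y in truth]
    net_x = [[(2 * y - 1) + rng.gauss(0, 0.5)] for y in truth]
    labels = [y if i < n_labeled else None for i, y in enumerate(truth)]
    return struct_x, net_x, labels, truth


def accuracy(w, xs, truth, idx):
    hits = sum((predict(w, xs[i]) > 0.5) == bool(truth[i]) for i in idx)
    return hits / len(idx)


if __name__ == "__main__":
    struct_x, net_x, labels, truth = make_toy_data()
    w_s, w_n = alternating_fit(struct_x, net_x, labels)
    unl = [i for i, y in enumerate(labels) if y is None]
    print("structure-scale accuracy:", accuracy(w_s, struct_x, truth, unl))
    print("network-scale accuracy:", accuracy(w_n, net_x, truth, unl))
```

Updating one model at a time, rather than training both jointly on a single summed loss, is one simple way to keep either view from dominating the fit, which is the imbalance the abstract attributes to existing multi-view methods.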