Wang Jianshi
Department of Systems Innovation, Graduate School of Engineering, Hongo Campus, The University of Tokyo, Tokyo 113-8656, Japan.
Os' Lab, Twin Towers South 17th Floor, 1-13-1 Umeda, Kita-ku, Osaka 530-0001, Japan.
Int J Mol Sci. 2025 Sep 5;26(17):8666. doi: 10.3390/ijms26178666.
Reliable prediction of chemical-protein interactions (CPIs) remains a key challenge in drug discovery, especially when biological data are sparse or noisy. We present MM-TCoCPIn, a Multi-Modal Topology-aware Chemical-Protein Interaction Network that integrates three causally grounded modalities (network topology, biomedical semantics, and 3D protein structure) into an interpretable graph learning framework. The model processes topological features via a CTC (Comprehensive Topological Characteristics)-based encoder, literature-derived semantics via SciBERT (Scientific Bidirectional Encoder Representations from Transformers), and structural geometry via a GVP-GNN (Geometric Vector Perceptron Graph Neural Network) applied to AlphaFold2 contact graphs. Evaluation on datasets from STITCH, STRING, and PubMed shows that MM-TCoCPIn achieves state-of-the-art performance (AUC = 0.93, F1 = 0.92), outperforming uni-modal baselines. Importantly, ablation and counterfactual analyses confirm that each modality contributes distinct biological insight: topology ensures robustness, semantics enhance recall, and structure sharpens precision. This framework offers a scalable and causally interpretable solution for CPI modeling, bridging the gap between predictive accuracy and mechanistic understanding.
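As a rough illustration of the multi-modal design and the ablation analysis described above, the following sketch fuses three per-modality embeddings by concatenation and re-scores a pair with one modality zeroed out. Everything here is an assumption made for illustration: the abstract does not state how MM-TCoCPIn fuses its modalities, the embedding sizes in `DIMS` are invented, the random vectors merely stand in for the outputs of the CTC, SciBERT, and GVP-GNN encoders, and the logistic scorer is a placeholder rather than the authors' prediction head.

```python
import math
import random

# Hypothetical embedding sizes; the abstract does not state any dimensions.
DIMS = {"topology": 4, "semantics": 6, "structure": 5}


def fuse(embeddings, ablate=()):
    """Concatenate per-modality vectors in a fixed order.

    Any modality named in `ablate` is replaced by zeros, which is one
    simple (assumed) way to run a modality ablation.
    """
    fused = []
    for name in DIMS:
        vec = embeddings[name]
        if len(vec) != DIMS[name]:
            raise ValueError(f"{name}: expected {DIMS[name]} values, got {len(vec)}")
        fused.extend([0.0] * len(vec) if name in ablate else vec)
    return fused


def interaction_score(fused, weights, bias=0.0):
    """Logistic score in (0, 1) for one chemical-protein pair."""
    z = bias + sum(w * x for w, x in zip(weights, fused))
    return 1.0 / (1.0 + math.exp(-z))


rng = random.Random(0)
# Random stand-ins for the outputs of the three encoders.
pair = {name: [rng.uniform(-1, 1) for _ in range(d)] for name, d in DIMS.items()}
# Random stand-in for learned weights of the scoring layer.
weights = [rng.uniform(-1, 1) for _ in range(sum(DIMS.values()))]

full = interaction_score(fuse(pair), weights)
for name in DIMS:
    ablated = interaction_score(fuse(pair, ablate=(name,)), weights)
    print(f"without {name}: {ablated:.3f} (full model: {full:.3f})")
```

In a real ablation study the comparison would be made on held-out AUC and F1 after retraining or masking across a whole dataset, not on the score of a single pair as in this toy example.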