Guan Jiahui, Yao Lantian, Xie Peilin, Zhao Zhihao, Meng Dian, Lee Tzong-Yi, Wang Junwen, Chiang Ying-Chih
Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China.
Division of Applied Oral Sciences and Community Dental Care, Faculty of Dentistry, The University of Hong Kong, 34 Hospital Road, Hong Kong SAR, China.
Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf292.
RNA-protein interactions (RPIs) are essential for many biological functions and are associated with various diseases. Traditional experimental methods for detecting RPIs are labor-intensive and costly, which motivates efficient computational methods. In this study, we propose a novel sequence-based RPI prediction framework built on graph neural networks (GNNs) that addresses key limitations of existing methods, such as inadequate feature integration and problems in negative sample construction. Our method represents RNAs and proteins as nodes in a unified interaction graph, enhances the representation of RPI pairs through multi-feature fusion, and employs self-supervised learning strategies for model training. The model's performance was validated through five-fold cross-validation, achieving accuracies of 0.880, 0.811, 0.950, 0.979, 0.910, and 0.924 on the RPI488, RPI369, RPI2241, RPI1807, RPI1446, and RPImerged datasets, respectively. In cross-species generalization tests, our method also outperformed existing methods, achieving an overall accuracy of 0.989 across 10,093 RPI pairs. Compared with other state-of-the-art RPI prediction methods, our approach demonstrates greater robustness and stability, highlighting its potential for broad biological applications and large-scale RPI analysis.
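The abstract describes placing RNAs and proteins as nodes in one interaction graph and evaluating with five-fold cross-validation. The Python sketch below illustrates only that data layout, under stated assumptions: the k-mer frequency features, the function names (`kmer_freqs`, `build_graph`, `five_fold`), and the requirement that RNA and protein names be unique are illustrative choices, not details taken from the paper. The GNN itself, the multi-feature fusion, and the self-supervised training are not shown.

```python
from itertools import product
import random


def kmer_freqs(seq, alphabet, k):
    """Normalized k-mer frequency vector (assumed feature; not the paper's)."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers containing non-standard symbols
            counts[j] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def build_graph(rnas, proteins, pairs):
    """Put RNA and protein nodes in one index space; edges are known RPI pairs.

    Assumes names are unique across the two dictionaries.
    """
    node_ids = {name: i for i, name in enumerate(list(rnas) + list(proteins))}
    features = {}
    for name, seq in rnas.items():
        features[node_ids[name]] = kmer_freqs(seq, "ACGU", 2)
    for name, seq in proteins.items():
        features[node_ids[name]] = kmer_freqs(seq, "ACDEFGHIKLMNPQRSTVWY", 1)
    edges = [(node_ids[r], node_ids[p]) for r, p in pairs]
    return node_ids, features, edges


def five_fold(items, seed=0):
    """Yield (train, test) splits for five-fold cross-validation."""
    items = list(items)
    random.Random(seed).shuffle(items)
    folds = [items[i::5] for i in range(5)]
    for i in range(5):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]


# Toy usage with made-up sequences (hypothetical data).
rnas = {"r1": "ACGUACGU", "r2": "GGGCCC"}
proteins = {"p1": "MKVLAA"}
node_ids, features, edges = build_graph(rnas, proteins, [("r1", "p1")])
```

In this sketch the RNA and protein feature vectors have different lengths (16 and 20), so a real model would need to project them into a common dimension before message passing. Each accuracy would then be computed on the held-out fold of labeled pairs and averaged over the five splits.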