Zhang Jiangbo, Peng Yunhui, Cui Feifei, Zhang Zilong, Yan Shankai, Zhang Qingchen
School of Computer Science and Technology, Hainan University, Haikou, 570100, Hainan, China.
School of Physics Science and Technology, Central China Normal University, Wuhan, 430000, Hubei, China.
BMC Bioinformatics. 2025 Jul 11;26(1):176. doi: 10.1186/s12859-025-06197-y.
RNA-binding proteins (RBPs) play crucial roles in gene regulation, and their dysregulation has been increasingly linked to neurodegenerative diseases, liver cancer, and lung cancer. Although experimental methods such as CLIP-seq accurately identify RNA-protein binding sites, they are time-consuming and costly. To address this, we propose RMDNet, a deep learning framework that integrates CNN, CNN-Transformer, and ResNet branches to capture features at multiple sequence scales. These features are fused with structural representations derived from RNA secondary structure graphs, which are processed by a graph neural network with DiffPool. To optimize feature integration, we incorporate an improved dung beetle optimization algorithm that adaptively assigns fusion weights during inference. Evaluations on the RBP-24 benchmark show that RMDNet outperforms state-of-the-art models, including GraphProt, DeepRKE, and DeepDW, across multiple metrics. On the RBP-31 dataset it demonstrates strong generalization ability, and ablation studies on RBPsuite2.0 validate the contributions of the individual modules. We assess biological interpretability by extracting candidate binding motifs from the first-layer CNN kernels. Several of these motifs closely match experimentally validated RBP motifs, confirming the model's capacity to learn biologically meaningful patterns. A downstream case study on YTHDF1 analyzes interpretable spatial binding patterns using a large-scale prediction dataset and CLIP-seq peak alignment. The results confirm that the model captures localized binding signals and is spatially consistent with experimental annotations. Overall, RMDNet is a robust and interpretable tool for predicting RNA-protein binding sites, with broad potential in disease mechanism research and therapeutic target discovery. The source code is available at https://github.com/cskyan/RMDNet .
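The abstract describes fusing the outputs of several branches with adaptively assigned weights. As a rough illustration of that idea only, the sketch below combines per-branch binding probabilities with a normalized weighted average and picks the weights by plain random search on log loss. The random search is a simple stand-in and is not the improved dung beetle optimization algorithm used by RMDNet; the function names (`fuse`, `log_loss`, `search_weights`) and the toy probabilities are hypothetical and do not come from the RMDNet code base.

```python
import math
import random


def fuse(branch_probs, weights):
    """Weighted average of per-branch probabilities, one value per sample."""
    total = sum(weights)
    norm = [w / total for w in weights]  # normalize so weights sum to 1
    n_samples = len(branch_probs[0])
    return [
        sum(w * probs[i] for w, probs in zip(norm, branch_probs))
        for i in range(n_samples)
    ]


def log_loss(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between fused probabilities and 0/1 labels."""
    loss = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return loss / len(labels)


def search_weights(branch_probs, labels, n_trials=500, seed=0):
    """Random search over fusion weights (stand-in for a real optimizer)."""
    rng = random.Random(seed)
    n = len(branch_probs)
    best_w = [1.0 / n] * n  # start from equal weighting
    best_loss = log_loss(fuse(branch_probs, best_w), labels)
    for _ in range(n_trials):
        cand = [rng.random() + 1e-9 for _ in range(n)]
        loss = log_loss(fuse(branch_probs, cand), labels)
        if loss < best_loss:
            s = sum(cand)
            best_w, best_loss = [c / s for c in cand], loss
    return best_w, best_loss


# Toy data: made-up probabilities from three hypothetical branches.
labels = [1, 0, 1, 0]
branch_probs = [
    [0.9, 0.2, 0.8, 0.1],  # e.g. a CNN branch
    [0.6, 0.5, 0.4, 0.5],  # e.g. a CNN-Transformer branch
    [0.7, 0.3, 0.6, 0.4],  # e.g. a ResNet branch
]
weights, loss = search_weights(branch_probs, labels)
fused = fuse(branch_probs, weights)
print("weights:", [round(w, 3) for w in weights], "loss:", round(loss, 4))
```

Because the search starts from equal weights and only accepts improvements, the returned loss is never worse than uniform averaging on the data it was tuned on. A population-based optimizer such as the one named in the abstract would replace the random-search loop while keeping the same fusion and objective functions.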