Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei 230601, Anhui, P. R. China.
School of Computer Science and Technology, Anhui University, Hefei 230601, Anhui, P. R. China.
J Bioinform Comput Biol. 2022 Aug;20(4):2250006. doi: 10.1142/S0219720022500068. Epub 2022 Apr 21.
RNA-binding proteins (RBPs) play crucial roles in various cellular processes such as alternative splicing and gene regulation. Therefore, the analysis and identification of RBPs is an essential task. However, although many computational methods have been developed for predicting RBPs, few studies simultaneously consider local and global information from the perspective of the RNA sequence. To address this challenge, we present a novel method called DeepBtoD, which predicts RBPs directly from RNA sequences. First, a k-BtoD encoding is designed, which takes into account the composition of k-nucleotides and their relative positions and forms a local module. Second, we design a multi-scale convolutional module with an embedded self-attention mechanism, the ms-focusCNN, which is used to learn more effective, diverse, and discriminative high-level features. Finally, global information is used to supplement the local modules through ensemble learning to predict whether the target RNA binds to RBPs. Preliminary results on 24 independent test datasets show that the proposed method classifies RBPs with an area under the curve (AUC) of 0.933. Remarkably, DeepBtoD shows competitive results against seven state-of-the-art methods, suggesting that RBPs can be recognized with high accuracy by integrating local k-BtoD and global information from RNA sequences alone. Hence, our integrative method may help improve the power of RBP prediction, which might be particularly useful for modeling protein-nucleic acid interactions in systems biology studies. Our DeepBtoD server can be accessed at http://175.27.228.227/DeepBtoD/.
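The abstract does not spell out the encoding itself, so the following is only a minimal sketch of the general idea it names: describing an RNA sequence by the composition of short nucleotide words together with where they occur. It is not the authors' BtoD encoding. The function name `kmer_features`, the choice of mean relative position as the positional summary, and the default word length are all assumptions made for illustration.

```python
from itertools import product

ALPHABET = "ACGU"


def kmer_features(seq, k=2):
    """Return {kmer: (frequency, mean_relative_position)} for an RNA sequence.

    Hypothetical illustration only: frequency is the share of windows that
    equal the k-mer, and mean_relative_position is the average start index
    of those windows scaled to the range [0, 1].
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        raise ValueError("sequence shorter than k")

    counts = {"".join(p): 0 for p in product(ALPHABET, repeat=k)}
    pos_sums = dict.fromkeys(counts, 0.0)

    for i in range(n_windows):
        kmer = seq[i:i + k]
        if kmer not in counts:  # skip windows with ambiguous bases such as N
            continue
        counts[kmer] += 1
        pos_sums[kmer] += i / max(n_windows - 1, 1)

    features = {}
    for kmer, c in counts.items():
        freq = c / n_windows
        mean_pos = pos_sums[kmer] / c if c else 0.0
        features[kmer] = (freq, mean_pos)
    return features


if __name__ == "__main__":
    for kmer, (freq, pos) in kmer_features("ACGUACGU", k=2).items():
        if freq > 0:
            print(kmer, round(freq, 3), round(pos, 3))
```

A fixed-length vector built this way (16 pairs for dinucleotides) could feed a downstream classifier; a convolutional model such as the one described would more likely consume a per-position representation, which this sketch does not attempt.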