School of Computer Science, Shaanxi Normal University, Xi'an, 710119, China.
College of Information Science and Engineering, Guilin University of Technology, Guilin, 541004, China.
BMC Bioinformatics. 2021 Jan 7;22(1):19. doi: 10.1186/s12859-020-03942-3.
Circular RNAs (circRNAs) are widely expressed in cells and tissues and are involved in biological processes and human diseases. Recent studies have demonstrated that circRNAs can interact with RNA-binding proteins (RBPs), and these interactions are considered an important avenue for investigating circRNA function.
In this study, we design a slight variant of the capsule network, called circRB, to identify the sequence specificities of circRNAs binding to RBPs. In this model, the sequence features of circRNAs are extracted by convolution operations, and two dynamic routing algorithms in a capsule network are then applied to these convolution features to discriminate between different binding sites. The experimental results show that circRB outperforms the existing computational methods. The trained models are then used to detect sequence motifs in the seven circRNA-RBP bound sequence datasets, and the detected motifs are matched to known human RNA motifs. Some motifs on circular RNAs overlap with those on linear RNAs. Finally, we predict binding sites on the reported full-length sequences of circRNAs that interact with RBPs, with the aim of supporting current studies. We hope that our model will contribute to a better understanding of the mechanisms of interaction between RBPs and circRNAs.
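The pipeline described above (convolutional feature extraction followed by capsule routing) can be illustrated with a minimal NumPy sketch. This is not the authors' circRB implementation: the abstract does not specify the layer sizes, the one-hot encoding, or which two routing algorithms are used, so everything below is an assumption. The sketch uses random, untrained weights and the standard routing-by-agreement procedure, and the names `one_hot`, `conv1d`, `squash`, `dynamic_routing`, and `binding_scores` are hypothetical.

```python
import numpy as np

ALPHABET = "ACGU"  # assumed encoding; the abstract does not state one


def one_hot(seq):
    """Encode an RNA sequence as a (length, 4) matrix; unknown bases stay all-zero."""
    mat = np.zeros((len(seq), len(ALPHABET)))
    for i, base in enumerate(seq.upper().replace("T", "U")):
        j = ALPHABET.find(base)
        if j >= 0:
            mat[i, j] = 1.0
    return mat


def conv1d(x, kernels):
    """Valid 1-D convolution plus ReLU. x: (L, 4); kernels: (n_filters, width, 4)."""
    n_filters, width, _ = kernels.shape
    out_len = x.shape[0] - width + 1
    out = np.empty((out_len, n_filters))
    for p in range(out_len):
        # Sum over the window positions and the 4 base channels for each filter.
        out[p] = np.tensordot(kernels, x[p:p + width], axes=([1, 2], [0, 1]))
    return np.maximum(out, 0.0)


def squash(s, eps=1e-9):
    """Capsule non-linearity: keeps direction, maps vector length into [0, 1)."""
    sq = np.sum(s ** 2, axis=-1, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)


def dynamic_routing(u_hat, n_iter=3):
    """Routing by agreement. u_hat: (n_in, n_out, dim) -> output capsules (n_out, dim)."""
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))  # routing logits
    for _ in range(n_iter):
        c = np.exp(b - b.max(axis=1, keepdims=True))
        c /= c.sum(axis=1, keepdims=True)          # softmax over output capsules
        v = squash(np.einsum("ij,ijd->jd", c, u_hat))
        b = b + np.einsum("ijd,jd->ij", u_hat, v)  # reward agreeing predictions
    return v


def binding_scores(seq, rng, n_filters=8, width=5, caps_dim=16):
    """Return two capsule lengths (assumed classes: non-binding, binding)."""
    feats = conv1d(one_hot(seq), rng.normal(size=(n_filters, width, 4)))
    primary = squash(feats)  # one primary capsule per sequence position
    weights = rng.normal(scale=0.1, size=(2, n_filters, caps_dim))
    u_hat = np.einsum("pf,kfd->pkd", primary, weights)
    return np.linalg.norm(dynamic_routing(u_hat), axis=-1)


scores = binding_scores("AUGGCUACGUAGCUAGCUAGGCUA", np.random.default_rng(0))
```

Because the weights are untrained, the two values in `scores` carry no biological meaning; the sketch only shows how capsule lengths can act as class scores once the convolution and transformation weights have been learned.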
Given the scarcity of studies on the sequence specificities of circRNA-binding proteins, we designed a classification framework called circRB based on the capsule network. The results show that circRB is effective and achieves higher prediction accuracy than other methods.