IEEE Trans Nanobioscience. 2024 Oct;23(4):603-611. doi: 10.1109/TNB.2024.3454079. Epub 2024 Oct 15.
Circular RNAs (circRNAs) play a crucial role in gene regulation and are associated with diseases, owing to their unique closed continuous loop structure, which is more stable and conserved than that of ordinary linear RNAs. As fundamental work toward clarifying their functions, a large number of computational approaches for identifying circRNA formation have been proposed. However, these methods fail to fully utilize the important characteristics of back-splicing events, i.e., the positional information of the splice sites and the interaction features of their flanking sequences, for predicting circRNAs. To this end, we propose a novel approach called SIDE for predicting circRNA back-splicing events using only raw RNA sequences. Technically, SIDE employs a dual encoder to capture global and interactive features of the RNA sequence, followed by a decoder designed with contrastive learning that fuses them into discriminative features, improving the prediction of circRNA formation. Empirical results on three real-world datasets show the effectiveness of SIDE. Further analysis also confirms the effectiveness of SIDE.
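To make the two characteristics named above concrete, the following minimal Python sketch shows one way a candidate back-splicing event could be turned into positional and flanking-sequence features from a raw RNA sequence. It is an illustration under stated assumptions, not the SIDE model: the function names (`flank`, `one_hot`, `pairing_matrix`, `encode_candidate`), the window width, the normalized-position encoding, and the use of a Watson-Crick pairing map as an "interaction" feature are all hypothetical choices that do not come from the abstract.

```python
# Illustrative only: NOT the SIDE architecture. All names and design choices
# here are assumptions made for the sake of a small, runnable example.

ALPHABET = "ACGU"
COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def flank(seq, site, width):
    """Return `width` bases on each side of `site`, padded with N at the ends."""
    left = max(0, site - width)
    right = min(len(seq), site + width)
    window = seq[left:right]
    # Pad so that every window has length 2 * width, even near sequence ends.
    pad_left = "N" * (width - (site - left))
    pad_right = "N" * (width - (right - site))
    return pad_left + window + pad_right


def one_hot(window):
    """One-hot encode a window; unknown bases such as N become all-zero rows."""
    return [[1 if base == letter else 0 for letter in ALPHABET] for base in window]


def pairing_matrix(upstream, downstream):
    """Binary map: 1 where upstream[i] and downstream[j] are Watson-Crick complements."""
    return [
        [1 if COMPLEMENT.get(u) == d else 0 for d in downstream]
        for u in upstream
    ]


def encode_candidate(seq, acceptor, donor, width=4):
    """Bundle positional and flanking features for one candidate back-splice event."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    up = flank(seq, acceptor, width)
    down = flank(seq, donor, width)
    return {
        # Splice-site positions, normalized by sequence length (an assumption).
        "relative_positions": (acceptor / len(seq), donor / len(seq)),
        "upstream_one_hot": one_hot(up),
        "downstream_one_hot": one_hot(down),
        # Pairwise complementarity between the two flanking windows.
        "interaction": pairing_matrix(up, down),
    }


if __name__ == "__main__":
    features = encode_candidate("ACGTACGTACGT", acceptor=2, donor=10, width=4)
    print(len(features["upstream_one_hot"]), len(features["interaction"]))
```

In a learned model, hand-built features like these would typically be replaced or augmented by trainable encoders; the sketch only shows what "positional information of the splice sites" and "interaction features of the flanking sequences" might look like as data.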