CircSSNN: circRNA-binding site prediction via sequence self-attention neural networks with pre-normalization.

Author information

School of Computer Science and Technology, Guangxi University of Science and Technology, Liuzhou, China.

Key Laboratory of Guangxi Universities on Intelligent Computing and Distributed Information Processing, Guangxi University of Science and Technology, Liuzhou, China.

Publication information

BMC Bioinformatics. 2023 May 30;24(1):220. doi: 10.1186/s12859-023-05352-7.

Abstract

BACKGROUND

Circular RNAs (circRNAs) play a significant role in some diseases by acting as transcription templates. Analyzing the interaction mechanism between circRNAs and RNA-binding proteins (RBPs) therefore has far-reaching implications for disease prevention and treatment. Existing models for circRNA-RBP identification usually adopt convolutional neural networks (CNNs), recurrent neural networks (RNNs), or their variants as feature extractors. Most of them suffer from poor parallelism, insufficient stability, and an inability to capture long-term dependencies.

METHODS

In this paper, we propose a new method that relies entirely on the self-attention mechanism to capture deep semantic features of RNA sequences. On this basis, we construct the CircSSNN model for circRNA-RBP identification. The proposed model builds its feature scheme by fusing circRNA sequence representations with statistical distributions, static local contexts, and dynamic global contexts. With a stable and efficient network architecture, the distance between any two positions in a sequence is reduced to a constant, so CircSSNN can quickly capture long-term dependencies and extract deep semantic features.
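
As a rough illustration of the idea (not the authors' implementation), the sketch below applies one single-head self-attention block with pre-normalization to a toy sequence in plain Python. The function names, the weight matrices `wq`, `wk`, `wv`, and the toy dimensions are all assumptions introduced here. Pre-normalization means the layer norm is applied to the input of the attention sublayer, and the unnormalized input is added back as a residual.

```python
import math
import random

def layer_norm(x, eps=1e-5):
    """Normalize each position's feature vector to zero mean and unit variance."""
    out = []
    for row in x:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product attention; returns (output, weights)."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(qi * ki for qi, ki in zip(q_row, k_row)) / scale
               for k_row in k] for q_row in q]
    weights = [softmax(r) for r in scores]
    return matmul(weights, v), weights

def pre_norm_block(x, wq, wk, wv):
    """Pre-normalization residual block: x + Attention(LayerNorm(x))."""
    attended, weights = self_attention(layer_norm(x), wq, wk, wv)
    out = [[a + b for a, b in zip(x_row, att_row)]
           for x_row, att_row in zip(x, attended)]
    return out, weights

# Toy input: 6 sequence positions with 4 features each (hypothetical sizes).
random.seed(0)
seq_len, dim = 6, 4
x = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(seq_len)]
wq, wk, wv = ([[random.gauss(0, 0.5) for _ in range(dim)] for _ in range(dim)]
              for _ in range(3))
out, weights = pre_norm_block(x, wq, wk, wv)
```

Each row of `weights` assigns a nonzero weight to every position, including the most distant one, so information moves between any two positions in a single step regardless of how far apart they are. This is the sense in which the distance between positions is constant.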

RESULTS

Experiments on 37 circRNA datasets show that the proposed model has overall advantages in stability, parallelism, and prediction performance. Keeping the network structure and hyperparameters unchanged, we apply CircSSNN directly to linear RNA (linRNA) datasets. The favorable results show that CircSSNN can be transferred to a new task simply and efficiently, without task-oriented tuning.

CONCLUSIONS

CircSSNN can serve as an appealing circRNA-RBP identification tool, offering good identification performance, excellent scalability, and a wide application scope without task-oriented fine-tuning of parameters. It is expected to lower the level of expertise required for hyperparameter tuning in bioinformatics analysis.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5dc7/10230723/3ea0df907bf4/12859_2023_5352_Fig1_HTML.jpg
