ncDENSE: a novel computational method based on a deep learning framework for non-coding RNAs family prediction.

Affiliations

College of Software, Jilin University, Changchun, 130012, China.

Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, China.

Publication information

BMC Bioinformatics. 2023 Feb 27;24(1):68. doi: 10.1186/s12859-023-05191-6.

DOI: 10.1186/s12859-023-05191-6

PMID: 36849908

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9972773/

Abstract

BACKGROUND

Although non-coding RNAs (ncRNAs) are a hot topic in the life sciences, the functions of many ncRNAs remain unclear. In recent years, researchers have found that ncRNAs of the same family have similar functions, so accurately predicting the family of an ncRNA is important for identifying its function. Existing methods for ncRNA family prediction fall into two categories: those based on the secondary structure features of ncRNAs and those based on their sequence features. Structure-based methods require a complicated process, and the secondary structures they depend on are obtained with low accuracy. Sequence-based methods have a simpler prediction process and higher accuracy, but still leave room for improvement. Given these problems of complicated prediction processes and low accuracy, a new method is needed to predict ncRNA families more accurately.

RESULTS

This study proposes ncDENSE, a deep learning method that predicts ncRNA families by extracting features from ncRNA sequences. The bases in each ncRNA sequence were one-hot encoded and fed into an ensemble deep learning model comprising a dynamic bi-directional gated recurrent unit (Bi-GRU), a dense convolutional network (DenseNet), and an Attention Mechanism (AM). Specifically, the dynamic Bi-GRU extracted contextual feature information and captured long-term dependencies in the ncRNA sequences. The AM assigned different weights to the features extracted by the Bi-GRU, focusing attention on the information with greater weights. DenseNet extracted local feature information from the ncRNA sequences, which were then classified by a fully connected layer. According to our results, ncDENSE improved the Accuracy, Sensitivity, Precision, F-score, and MCC by 2.08%, 2.33%, 2.14%, 2.16%, and 2.39%, respectively, compared with the next-best method.
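The first and third steps of the pipeline above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a four-letter alphabet `ACGU`, maps unknown symbols to an all-zero vector, and uses hand-picked attention scores in place of the scores a trained attention layer would produce from Bi-GRU outputs. The function names `one_hot_encode` and `attention_pool` are hypothetical.

```python
import math

BASES = "ACGU"  # assumed alphabet; the abstract does not list the symbols used


def one_hot_encode(sequence):
    """Map each base to a 4-dimensional indicator vector.

    Unknown symbols (e.g. 'N') become an all-zero vector. This is an
    assumption of the sketch, not something stated in the abstract.
    """
    encoded = []
    for base in sequence.upper().replace("T", "U"):
        row = [0.0] * len(BASES)
        index = BASES.find(base)
        if index >= 0:
            row[index] = 1.0
        encoded.append(row)
    return encoded


def attention_pool(features, scores):
    """Softmax the per-position scores and return (weights, weighted sum).

    `features` is a list of per-position feature vectors; `scores` holds one
    unnormalised relevance score per position.
    """
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    width = len(features[0])
    pooled = [
        sum(w * row[j] for w, row in zip(weights, features))
        for j in range(width)
    ]
    return weights, pooled


if __name__ == "__main__":
    x = one_hot_encode("ACGU")
    # Hypothetical scores: the last position is treated as most informative.
    weights, pooled = attention_pool(x, [0.0, 0.0, 0.0, 2.0])
    print(weights)
    print(pooled)
```

In the described model the attention weights are applied to Bi-GRU features rather than to the raw one-hot vectors; the one-hot input is reused here only to keep the example self-contained.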

CONCLUSIONS

Overall, ncDENSE extracts sequence features of ncRNAs with a dynamic Bi-GRU and DenseNet, and improves the accuracy of predicting ncRNA families and other data.
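The five evaluation metrics reported in the results (Accuracy, Sensitivity, Precision, F-score, and MCC) have standard definitions in terms of confusion-matrix counts. The sketch below computes them for a single class treated one-vs-rest. The abstract does not say how the authors averaged scores across families, so the per-class form and the function name `classification_metrics` are assumptions for illustration.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return the five scores for one class from its confusion counts.

    tp, fp, tn, fn are true positives, false positives, true negatives and
    false negatives. Undefined ratios (zero denominator) are reported as 0.0,
    which is a convention chosen for this sketch.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    pr_sum = precision + sensitivity
    f_score = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f_score": f_score,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; they are not from the paper.
    for name, value in classification_metrics(8, 2, 88, 2).items():
        print(f"{name}: {value:.4f}")
```

Unlike accuracy, MCC uses all four counts in both numerator and denominator, so it stays informative when one family is much rarer than the others.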

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5f37/9972773/112eee18373b/12859_2023_5191_Fig1_HTML.jpg
