NCResNet: Noncoding Ribonucleic Acid Prediction Based on a Deep Resident Network of Ribonucleic Acid Sequences.

Authors

Yang Sen, Wang Yan, Zhang Shuangquan, Hu Xuemei, Ma Qin, Tian Yuan

Affiliations

Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, and College of Computer Science and Technology, Jilin University, Changchun, China.

School of Artificial Intelligence, Jilin University, Changchun, China.

Publication information

Front Genet. 2020 Feb 28;11:90. doi: 10.3389/fgene.2020.00090. eCollection 2020.

DOI: 10.3389/fgene.2020.00090
PMID: 32180792
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7059790/

Abstract

Noncoding RNA (ncRNA) is a class of RNA that plays an important role in many biological processes, diseases, and cancers but cannot be translated into proteins. With the development of next-generation sequencing technology, thousands of novel RNAs with long open reading frames (ORFs; longest ORF length > 303 nt) and short ORFs (longest ORF length ≤ 303 nt) have been discovered in a short time. Identifying ncRNAs more precisely among novel unannotated RNAs is an important step for RNA functional analysis and RNA regulation. However, most previous methods use only sequence features, and most of them focus on long-ORF RNA sequences and are not adapted to short-ORF RNA sequences. In this paper, we propose a new reliable method called NCResNet. NCResNet takes 57 hybrid features from four categories as inputs (sequence, protein, RNA structure, and RNA physicochemical properties) and introduces feature enhancement and deep feature learning policies in a neural network model to adapt to this problem. Experiments on benchmark datasets of 8 species show that NCResNet achieves higher accuracy and a higher Matthews correlation coefficient (MCC) than other state-of-the-art methods. In particular, on four short-ORF RNA sequence datasets, including mouse, zebrafish, and cow, NCResNet improves accuracy by more than 10% and MCC by more than 15% over other state-of-the-art methods. For long-ORF RNA sequence datasets, NCResNet also has better accuracy and MCC than other state-of-the-art methods on most test datasets. Code and data are available at https://github.com/abcair/NCResNet.
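
The abstract relies on two quantitative notions: the 303 nt longest-ORF threshold that separates long-ORF from short-ORF sequences, and the accuracy and MCC scores used for evaluation. The Python sketch below illustrates both. It is not taken from the NCResNet repository. The ORF definition used here (an ATG start codon through the first in-frame stop codon, forward strand only, stop codon counted in the length) is an assumption, since the abstract does not state one, and all function names are hypothetical.

```python
import math

STOP_CODONS = {"TAA", "TAG", "TGA"}
ORF_THRESHOLD_NT = 303  # threshold quoted in the abstract


def longest_orf_length(seq: str) -> int:
    """Length (nt) of the longest ATG..stop ORF on the forward strand.

    Assumed definition: the abstract does not say how ORFs are delimited.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)  # stop codon included
                start = None
    return best


def orf_class(seq: str) -> str:
    """Label a sequence using the > 303 nt / <= 303 nt split."""
    if longest_orf_length(seq) > ORF_THRESHOLD_NT:
        return "long-ORF"
    return "short-ORF"


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correct predictions from confusion-matrix counts."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

Under these assumptions, a transcript would first be labeled with `orf_class` to decide which benchmark group it belongs to, and a classifier's predictions on each group would then be summarized with `accuracy` and `mcc`. MCC is the more informative of the two when coding and noncoding classes are imbalanced, because it uses all four confusion-matrix counts.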

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/940a/7059790/c774eba8e9f6/fgene-11-00090-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/940a/7059790/9b9ae5c888e8/fgene-11-00090-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/940a/7059790/cc6ed68a7408/fgene-11-00090-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/940a/7059790/026588c97a8e/fgene-11-00090-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/940a/7059790/61b2d00fde35/fgene-11-00090-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/940a/7059790/8c0f61d871e9/fgene-11-00090-g006.jpg
