BASiNET-BiologicAl Sequences NETwork: a case study on coding and non-coding RNAs identification.

Affiliations

Department of Computer Science, Bioinformatics Graduate Program, Federal University of Technology - Paraná, Cornélio Procópio, PR 86300-000, Brazil.

Empresa Brasileira de Pesquisa Agropecuária, Embrapa Café, Brasília, DF 70770-901, Brazil.

Publication information

Nucleic Acids Res. 2018 Sep 19;46(16):e96. doi: 10.1093/nar/gky462.

DOI: 10.1093/nar/gky462
PMID: 29873784
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6144827/
Abstract

With the emergence of Next Generation Sequencing (NGS) technologies, a large volume of sequence data, in particular from de novo sequencing, has been produced rapidly and at relatively low cost. In this context, computational tools are increasingly important for identifying the information needed to understand how organisms function. This work introduces BASiNET, an alignment-free tool that classifies biological sequences using features extracted from complex network measurements. The method first transforms the sequences and represents them as complex networks. It then extracts topological measures and builds a feature vector that is used to classify the sequences. The method was evaluated on the classification of coding and non-coding RNAs from 13 species and compared with the CNCI, PLEK and CPC2 methods. BASiNET outperformed all compared methods in all adopted organisms and datasets. BASiNET classified sequences in all organisms with high accuracy and low standard deviation, showing that the method is robust and not biased by organism. The proposed methodology is implemented as open source in the R language and is freely available for download at https://cran.r-project.org/package=BASiNET.
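The pipeline described in the abstract (sequence to network, network to topological measures, measures to feature vector) can be illustrated with a minimal Python sketch. This is not the BASiNET implementation, which is an R package. The function names, the word length of 3, the step of 1, and the four measures chosen here are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict
from itertools import combinations


def sequence_to_network(seq, word=3, step=1):
    """Map a sequence to an undirected graph (illustrative assumption):
    nodes are words of length `word`, and an edge links each pair of
    consecutive words. Edge values count how often the pair occurs."""
    words = [seq[i:i + word] for i in range(0, len(seq) - word + 1, step)]
    edges = defaultdict(int)
    for a, b in zip(words, words[1:]):
        if a != b:  # ignore self-loops
            edges[frozenset((a, b))] += 1
    return set(words), dict(edges)


def topological_features(nodes, edges):
    """Return [average degree, maximum degree, density, average clustering].
    These four measures are hypothetical stand-ins for the topological
    measures mentioned in the abstract."""
    adj = {v: set() for v in nodes}
    for edge in edges:
        a, b = tuple(edge)
        adj[a].add(b)
        adj[b].add(a)

    n, m = len(nodes), len(edges)
    degrees = [len(neigh) for neigh in adj.values()]
    avg_degree = 2 * m / n if n else 0.0
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0

    def clustering(v):
        # Fraction of neighbour pairs of v that are themselves connected.
        k = len(adj[v])
        if k < 2:
            return 0.0
        links = sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
        return 2 * links / (k * (k - 1))

    avg_clustering = sum(clustering(v) for v in adj) / n if n else 0.0
    return [avg_degree, max(degrees, default=0), density, avg_clustering]


# Usage: one feature vector per sequence, ready for any standard classifier.
nodes, edges = sequence_to_network("ATGATG")
features = topological_features(nodes, edges)
```

Each sequence yields a fixed-length vector regardless of its length, which is what allows sequences to be compared without alignment. Training a classifier on these vectors to separate coding from non-coding RNAs is left out of the sketch.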
