Mirnovo: genome-free prediction of microRNAs from small RNA sequencing data and single-cells using decision forests.

Author information

Vitsios Dimitrios M, Kentepozidou Elissavet, Quintais Leonor, Benito-Gutiérrez Elia, van Dongen Stijn, Davis Matthew P, Enright Anton J

Affiliations

European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK.

Publication information

Nucleic Acids Res. 2017 Dec 1;45(21):e177. doi: 10.1093/nar/gkx836.

PMID: 29036314

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5716205/

Abstract

The discovery of microRNAs (miRNAs) remains an important problem, particularly given the growth of high-throughput sequencing, cell sorting and single cell biology. While a large number of miRNAs have already been annotated, there may well be large numbers of miRNAs that are expressed in very particular cell types and remain elusive. Sequencing allows us to quickly and accurately identify the expression of known miRNAs from small RNA-Seq data. The biogenesis of miRNAs leads to very specific characteristics observed in their sequences. In brief, miRNAs usually have a well-defined 5' end and a more flexible 3' end with the possibility of 3' tailing events, such as uridylation. Previous approaches to the prediction of novel miRNAs usually involve the analysis of structural features of miRNA precursor hairpin sequences obtained from genome sequence. We surmised that it may be possible to identify miRNAs by using these biogenesis features observed directly from sequenced reads, solely or in addition to structural analysis from genome data. To this end, we have developed mirnovo, a machine-learning-based algorithm, which is able to identify known and novel miRNAs in animals and plants directly from small RNA-Seq data, with or without a reference genome. This method performs comparably to existing tools but is simpler to use and has a shorter run time. Its performance and accuracy have been tested on multiple datasets, including species with poorly assembled genomes, RNaseIII (Drosha and/or Dicer) deficient samples and single cells (at both embryonic and adult stages).
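The read-level signature described in the abstract (a well-defined 5' end and a more variable 3' end) can be illustrated with a small sketch. This is not mirnovo's code or its actual feature set. The function name `end_consistency`, the `(start, end, count)` input format and the toy read counts are all assumptions made for illustration.

```python
from collections import Counter


def end_consistency(reads):
    """Return (five_prime, three_prime) consistency for one read cluster.

    `reads` is a list of (start, end, count) tuples. Positions are offsets
    relative to a hypothetical cluster consensus sequence (an assumed input
    format, not one defined by the paper). Each value is the fraction of
    reads that share the most common start or end position.
    """
    starts, ends = Counter(), Counter()
    for start, end, count in reads:
        starts[start] += count
        ends[end] += count

    total = sum(starts.values())
    if total == 0:
        raise ValueError("no reads supplied")

    five_prime = starts.most_common(1)[0][1] / total
    three_prime = ends.most_common(1)[0][1] / total
    return five_prime, three_prime


# Toy cluster (invented counts): nearly all reads start at offset 0,
# while the 3' end varies between offsets 21, 22 and 23.
cluster = [(0, 22, 80), (0, 23, 15), (0, 21, 4), (1, 22, 1)]
five_prime, three_prime = end_consistency(cluster)
```

In a classifier such as a decision forest, values like these could serve as two of many per-cluster input features. Which features mirnovo actually uses is not stated in the abstract.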

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1307/5716205/0290efd888eb/gkx836fig1.jpg

Similar articles

Mirinho: An efficient and general plant and animal pre-miRNA predictor for genomic and deep sequencing data.

BMC Bioinformatics. 2015 May 29;16:179. doi: 10.1186/s12859-015-0594-0.

BrumiR: A toolkit for de novo discovery of microRNAs from sRNA-seq data.

Gigascience. 2022 Oct 25;11. doi: 10.1093/gigascience/giac093.

iMir: an integrated pipeline for high-throughput analysis of small non-coding RNA data obtained by smallRNA-Seq.

BMC Bioinformatics. 2013 Dec 13;14:362. doi: 10.1186/1471-2105-14-362.

MirPlex: a tool for identifying miRNAs in high-throughput sRNA datasets without a genome.

J Exp Zool B Mol Dev Evol. 2013 Jan;320(1):47-56. doi: 10.1002/jez.b.22483. Epub 2012 Nov 26.

sRNAnalyzer-a flexible and customizable small RNA sequencing data analysis pipeline.

Nucleic Acids Res. 2017 Dec 1;45(21):12140-12151. doi: 10.1093/nar/gkx999.

miRdentify: high stringency miRNA predictor identifies several novel animal miRNAs.

Nucleic Acids Res. 2014;42(16):e124. doi: 10.1093/nar/gku598. Epub 2014 Jul 22.

microRPM: a microRNA prediction model based only on plant small RNA sequencing data.

Bioinformatics. 2018 Apr 1;34(7):1108-1115. doi: 10.1093/bioinformatics/btx725.

Genome-wide Mapping of DROSHA Cleavage Sites on Primary MicroRNAs and Noncanonical Substrates.

Mol Cell. 2017 Apr 20;66(2):258-269.e5. doi: 10.1016/j.molcel.2017.03.013.

Identification and profiling of novel microRNAs in the Brassica rapa genome based on small RNA deep sequencing.

BMC Plant Biol. 2012 Nov 19;12:218. doi: 10.1186/1471-2229-12-218.

Cited by

Class-agnostic annotation of small RNAs balances sensitivity and specificity in diverse organisms.

Comput Struct Biotechnol J. 2025 May 27;27:2450-2459. doi: 10.1016/j.csbj.2025.05.045. eCollection 2025.

Analysis of microRNAs in response to eensis infection and their potential role in vectorial capacity.

Front Cell Infect Microbiol. 2024 Jul 17;14:1427562. doi: 10.3389/fcimb.2024.1427562. eCollection 2024.

Analysis of microRNAs in response to eensis infection and their potential role in vectorial capacity.

bioRxiv. 2024 May 6:2024.05.03.592465. doi: 10.1101/2024.05.03.592465.

MyBrain-Seq: A Pipeline for MiRNA-Seq Data Analysis in Neuropsychiatric Disorders.

Biomedicines. 2023 Apr 21;11(4):1230. doi: 10.3390/biomedicines11041230.

Non-coding RNAs in human health and disease: potential function as biomarkers and therapeutic targets.

Funct Integr Genomics. 2023 Jan 10;23(1):33. doi: 10.1007/s10142-022-00947-4.

A Unified Computational Framework for a Robust, Reliable, and Reproducible Identification of Novel miRNAs From the RNA Sequencing Data.

Front Bioinform. 2022 Jul 8;2:842051. doi: 10.3389/fbinf.2022.842051. eCollection 2022.

BrumiR: A toolkit for de novo discovery of microRNAs from sRNA-seq data.

Gigascience. 2022 Oct 25;11. doi: 10.1093/gigascience/giac093.

Differences in PLA Constitution Distinguish the Venom of Two Endemic Brazilian Mountain Lanceheads, and .

Toxins (Basel). 2022 Mar 25;14(4):237. doi: 10.3390/toxins14040237.

The Multiverse of Plant Small RNAs: How Can We Explore It?

Int J Mol Sci. 2022 Apr 2;23(7):3979. doi: 10.3390/ijms23073979.

Biogenesis, Functions, Interactions, and Resources of Non-Coding RNAs in Plants.

Int J Mol Sci. 2022 Mar 28;23(7):3695. doi: 10.3390/ijms23073695.
