Institute of Neuroscience, University of Oregon, Eugene, OR 97403, USA.
Computational Medicine Center, Thomas Jefferson University, Philadelphia, PA 19144, USA.
Bioinformatics. 2020 Feb 1;36(3):698-703. doi: 10.1093/bioinformatics/btz675.
MicroRNAs (miRNAs) are small RNA molecules (∼22 nucleotides long) involved in post-transcriptional gene regulation. Advances in high-throughput sequencing technologies have led to the discovery of isomiRs, which are miRNA sequence variants. While many miRNA-seq analysis tools exist, the diversity of their output formats hinders accurate comparisons between tools and precludes data sharing and the development of common downstream analysis methods.
To address this, we present a community-based project, the miRNA Transcriptomic Open Project (miRTOP), which works towards the optimization of miRNA analyses. The aim of miRTOP is to promote the development of downstream isomiR analysis tools that are compatible with existing detection and quantification tools. Building on the existing GFF3 format, we first created a new standard format, mirGFF3, for reporting miRNA/isomiR detection and quantification results from small RNA-seq data. We also developed a command-line Python tool, mirtop, to create and manage mirGFF3 files. Currently, mirtop can convert into mirGFF3 the outputs of commonly used pipelines, such as seqbuster, isomiR-SEA, sRNAbench and Prost!, as well as BAM files. Some tools, such as miRge2.0, IsoMIRmap and OptimiR, have also incorporated the mirGFF3 format directly into their code. Its open architecture enables any tool or pipeline to output or convert results into mirGFF3. Collectively, the isomiR categorization system, together with the accompanying mirGFF3 format and mirtop API, provides a comprehensive solution for the standardization of miRNA and isomiR annotation, enabling data sharing, reporting, comparative analyses and benchmarking, while promoting the development of common miRNA methods that focus on steps downstream of miRNA detection, annotation and quantification.
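Because mirGFF3 is described as building on GFF3, a downstream tool could in principle read it with an ordinary nine-column, tab-separated parser. The following is a minimal sketch under stated assumptions, not the mirtop API: it uses the generic GFF3 convention of `key=value` attribute pairs separated by semicolons, and the example record, its attribute names (`Name`, `Parent`, `Variant`, `Expression`) and all values are hypothetical illustrations. The actual attribute syntax and required fields should be taken from the mirGFF3 specification in the repository listed under availability.

```python
from typing import Dict, NamedTuple


class Gff3Record(NamedTuple):
    """One feature line in a GFF3-style file (nine tab-separated columns)."""
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_gff3_line(line: str) -> Gff3Record:
    """Parse a single GFF3-style feature line into a Gff3Record.

    Assumes generic GFF3 attribute syntax (key=value pairs separated by ';').
    The real mirGFF3 attribute syntax may differ; check the specification.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")

    attributes: Dict[str, str] = {}
    for pair in fields[8].split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing semicolon
        key, _, value = pair.partition("=")
        attributes[key] = value

    return Gff3Record(
        seqid=fields[0],
        source=fields[1],
        type=fields[2],
        start=int(fields[3]),
        end=int(fields[4]),
        score=fields[5],
        strand=fields[6],
        phase=fields[7],
        attributes=attributes,
    )


# Hypothetical record: every value below is made up for illustration only.
EXAMPLE_LINE = "\t".join([
    "hsa-let-7a-1", "example_db", "isomiR", "4", "25", "0", "+", ".",
    "Name=hsa-let-7a-5p;Parent=hsa-let-7a-1;Variant=iso_3p:-1;Expression=23",
])

record = parse_gff3_line(EXAMPLE_LINE)
print(record.type, record.start, record.end, record.attributes["Variant"])
```

Keeping the parser generic is a deliberate choice in this sketch: any tool that emits the shared format can then be consumed by the same downstream code, which is the interoperability the format is meant to provide.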
Available at https://github.com/miRTop/mirGFF3/ and https://github.com/miRTop/mirtop.
Supplementary data are available at Bioinformatics online.