Evers Maurits, Huttner Michael, Dueck Anne, Meister Gunter, Engelmann Julia C
Institute of Functional Genomics, University of Regensburg, Regensburg, Germany.
Biochemistry Center Regensburg (BZR), Laboratory for RNA Biology, University of Regensburg, Regensburg, Germany.
BMC Bioinformatics. 2015 Nov 5;16:370. doi: 10.1186/s12859-015-0798-3.
MicroRNAs (miRNAs) are short regulatory RNAs derived from longer precursor RNAs. Studies of miRNA biogenesis in animals and plants have recently revealed more complex aspects, such as non-conserved, species-specific, and heterogeneous miRNA precursor populations. Small RNA sequencing data can help to identify the genomic loci of miRNA precursors computationally. The challenge is to predict a valid miRNA precursor from the inhomogeneous read coverage of a complex RNA library: while the mature miRNA typically produces many sequence reads, the remainder of the precursor is covered only sparsely. Recent results suggest that alternative miRNA biogenesis pathways may lead to a more diverse miRNA precursor population than previously assumed. In plants, this diversity manifests itself in, for example, complex secondary structures and expression from multiple loci within precursors. Current miRNA identification algorithms often depend on existing gene annotation and/or rely on specific miRNA precursor features such as precursor length and secondary structure. Consequently, and in view of the emerging understanding of a more complex miRNA biogenesis in plants, current tools may fail to characterise organism-specific and heterogeneous miRNA populations.
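The coverage problem described above can be illustrated with a minimal sketch. It is not miRA's algorithm; the function names, the depth threshold, and the flank size are hypothetical choices made only for illustration. The sketch computes per-base read depth, keeps blocks that are deeply covered (as a mature miRNA would be), and extends each block by a fixed flank so that the sparsely covered remainder of a putative precursor is included in the candidate window.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in genome coordinates


def coverage(reads: List[Interval], length: int) -> List[int]:
    """Per-base read depth over a sequence, computed with a difference array."""
    diff = [0] * (length + 1)
    for start, end in reads:
        diff[start] += 1
        diff[end] -= 1
    depth, running = [], 0
    for delta in diff[:length]:
        running += delta
        depth.append(running)
    return depth


def covered_blocks(depth: List[int], min_depth: int) -> List[Interval]:
    """Contiguous runs of positions whose depth is at least min_depth."""
    blocks: List[Interval] = []
    block_start = None
    for pos, d in enumerate(depth):
        if d >= min_depth and block_start is None:
            block_start = pos
        elif d < min_depth and block_start is not None:
            blocks.append((block_start, pos))
            block_start = None
    if block_start is not None:
        blocks.append((block_start, len(depth)))
    return blocks


def candidate_windows(blocks: List[Interval], flank: int, length: int) -> List[Interval]:
    """Extend each block by `flank` bases on both sides, clipped to the sequence."""
    return [(max(0, s - flank), min(length, e + flank)) for s, e in blocks]


# Toy data (made up): 50 reads on a "mature miRNA", 2 reads elsewhere in the precursor.
reads = [(100, 121)] * 50 + [(160, 181)] * 2
depth = coverage(reads, length=300)
blocks = covered_blocks(depth, min_depth=10)  # min_depth is a hypothetical threshold
windows = candidate_windows(blocks, flank=80, length=300)  # flank is hypothetical
```

In this toy case only the deeply covered block passes the threshold, yet the extended window still spans the two stray reads. A real tool would go on to evaluate each window, for instance by its predicted secondary structure.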
miRA is a new tool for identifying miRNA precursors in plants that allows for heterogeneous and complex precursor populations. miRA requires small RNA sequencing data and a corresponding reference genome, and evaluates precursor secondary structures and precursor processing accuracy; key parameters can be adapted to the specific organism under investigation. We show that miRA outperforms the currently best plant miRNA prediction tools in both sensitivity and specificity on data from Arabidopsis thaliana and the volvocine alga Chlamydomonas reinhardtii. The latter organism has been shown to exhibit a heterogeneous and complex precursor population with little cross-species miRNA sequence conservation, and therefore constitutes an ideal model organism. Furthermore, we identify novel miRNAs in the Chlamydomonas-related organism Volvox carteri.
We propose miRA, a new plant miRNA identification tool that is well adapted to complex precursor populations. miRA is particularly suited to organisms that have no existing miRNA annotation or no known related organism with well-characterised miRNAs. Moreover, miRA has proven able to identify species-specific miRNAs. miRA is flexible in its parameter settings and produces user-friendly output files in various formats (PDF, CSV, annotation files suitable for genome browsers, etc.). It is freely available at https://github.com/mhuttner/miRA.