Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, D-04107 Leipzig, Germany.
Doctoral School of Science and Technology, Center for Biotechnology Research, Lebanese University, Hadath Campus, Beirut, Lebanon.
Bioinformatics. 2019 Nov 1;35(22):4553-4559. doi: 10.1093/bioinformatics/btz271.
MicroRNAs form an important class of RNA regulators that has been studied extensively. The miRBase and Rfam databases provide rich, frequently updated information on both pre-miRNAs and their mature forms. These data sources, however, rely on individual data submissions and thus are neither complete nor consistent in their coverage across different miRNA families. Quantitative studies of miRNA evolution are therefore difficult or impossible on this basis.
We present here a workflow and a corresponding implementation, MIRfix, that automatically curates miRNA datasets by improving the alignments of their precursors, the consistency of the annotation of mature miR and miR* sequences, and the phylogenetic coverage. MIRfix produces alignments that are comparable across families and sets the stage for improved homology search as well as quantitative analyses.
MIRfix can be downloaded from https://github.com/Bierinformatik/MIRfix.
Supplementary data are available at Bioinformatics online.