Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, D-04107 Leipzig, Germany.
Max Planck Institute for Mathematics in the Sciences, D-04103 Leipzig, Germany.
Genes (Basel). 2021 Feb 27;12(3):348. doi: 10.3390/genes12030348.
Homology-based annotation of short RNAs, including microRNAs, is a difficult problem because their inherently small size limits the available information. Highly sensitive methods, including parameter-optimized blast, nhmmer, or cmsearch runs, inevitably lead to large numbers of false positives, which can be detected only by detailed analysis of specific features typical of an RNA family and/or by analysis of conservation patterns in structure-annotated multiple sequence alignments. The miRNAture pipeline implements a workflow specific to animal microRNAs that automates the homology search and validation steps. It yields very good results for a large number of "typical" miRBase families. However, it also highlights difficulties with atypical cases, in particular microRNAs derived from repetitive elements and microRNAs with unusual, branched precursor structures and atypical locations of the mature product, which require specific curation by domain experts.