Gebert Daniel, Hewel Charlotte, Rosenkranz David
Institute of Organismic and Molecular Evolutionary Biology, Anthropology, Johannes Gutenberg University, 55099, Mainz, Germany.
BMC Genomics. 2017 Aug 22;18(1):644. doi: 10.1186/s12864-017-4031-9.
Next-generation sequencing is a key technique in small RNA biology research and has led to the discovery of functionally distinct classes of small non-coding RNAs in recent years. However, reliable annotation of the large amounts of small non-coding RNA data produced by high-throughput sequencing is time-consuming and requires solid bioinformatics expertise. Moreover, existing tools have a number of shortcomings, including a lack of sensitivity under certain conditions, a limited number of supported species, or a limited number of detectable sub-classes of small RNAs.
Here we introduce unitas, ready-to-use software for complete annotation of small RNA sequence datasets. It supports the wide range of species for which non-coding RNA reference sequences are available in the Ensembl databases (currently more than 800). unitas combines high-quality annotation and numerous analysis features in a user-friendly manner. A complete annotation can be started with one simple shell command, making unitas particularly useful for researchers who do not have access to a bioinformatics facility. Notably, the algorithms implemented in unitas perform on par with, or even outperform, comparable existing tools for small RNA annotation that map to publicly available ncRNA databases.
unitas brings together annotation and analysis features that previously required the installation of numerous different bioinformatics tools, which can pose a challenge for the non-expert user. With this, unitas overcomes the problem of read normalization. Moreover, the high quality of sequence annotation and analysis, paired with ease of use, makes unitas a valuable tool for researchers in all fields connected to small RNA biology.