

Software evaluation for de novo detection of transposons.

Author information

Rodriguez Matias, Makałowski Wojciech

Affiliations

Institute of Bioinformatics, Faculty of Medicine, University of Münster, 48149, Münster, Germany.

Publication information

Mob DNA. 2022 Apr 27;13(1):14. doi: 10.1186/s13100-022-00266-2.

Abstract

Transposable elements (TEs) are major components of most eukaryotic genomes and play an important role in genome evolution. Despite their relevance, identifying TEs is not an easy task, and a number of tools have been developed to tackle this problem. To better understand how these tools perform, we tested several widely used tools for de novo TE detection and compared their performance on both simulated data and well-curated genomic sequences. As expected, tools that build TE models performed better than k-mer counting ones, with RepeatModeler beating competitors on most datasets. However, most tools tend to identify TE regions in a fragmented manner, and small or fragmented TEs frequently go undetected. Consequently, the identification of TEs remains a challenging endeavor that requires significant manual curation by an experienced expert. The results will be helpful for identifying common issues associated with TE annotation and for evaluating how comparable the results obtained with different tools are.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/446d/9047281/3c899f6f69f0/13100_2022_266_Fig1_HTML.jpg
