DANTE and DANTE_LTR: lineage-centric annotation pipelines for long terminal repeat retrotransposons in plant genomes.

Author information

Novák Petr, Hoštáková Nina, Neumann Pavel, Macas Jiří

Affiliation

Biology Centre, Czech Academy of Sciences, Branišovská 31, České Budějovice, Czech Republic.

Publication information

NAR Genom Bioinform. 2024 Aug 29;6(3):lqae113. doi: 10.1093/nargab/lqae113. eCollection 2024 Sep.

Abstract

Long terminal repeat (LTR) retrotransposons constitute a predominant class of repetitive DNA elements in most plant genomes. With the increasing number of sequenced plant genomes, there is an ongoing demand for computational tools facilitating efficient annotation and classification of LTR retrotransposons in plant genome assemblies. Herein, we introduce DANTE, a computational pipeline for Domain-based ANnotation of Transposable Elements, designed for sensitive detection of these elements via their conserved protein domain sequences. The identified protein domains are subsequently inputted into the DANTE_LTR pipeline to annotate complete element sequences by detecting their structural features, such as LTRs, in adjacent genomic regions. Leveraging domain sequences allows for precise classification of elements into phylogenetic lineages, offering a more granular annotation compared with coarser conventional superfamily-based classification methods. The efficiency and accuracy of this approach were evidenced via annotation of LTR retrotransposons in 93 plant genomes. Results were benchmarked against several established pipelines, showing that DANTE_LTR is capable of identifying significantly more intact LTR retrotransposons. DANTE and DANTE_LTR are provided as user-friendly Galaxy tools accessible via a public server (https://repeatexplorer-elixir.cerit-sc.cz), installable on local Galaxy instances from the Galaxy tool shed or executable from the command line.
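The domain-first idea described in the abstract (locate a conserved protein domain, then search the adjacent genomic regions for structural features such as LTRs) can be illustrated with a toy sketch. This is not DANTE_LTR's implementation: the abstract does not describe the actual detection method, so the function name `candidate_ltr_pair`, the `flank` and `min_len` parameters, and the use of exact repeat matching are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def candidate_ltr_pair(genome, domain_start, domain_end, flank=50, min_len=8):
    """Toy illustration (hypothetical, not the DANTE_LTR algorithm).

    Given the coordinates of a protein-domain hit, take the regions
    immediately upstream and downstream of it and look for the longest
    exact repeat they share, as a stand-in for an LTR pair.

    Returns (left_start, right_start, length) in genome coordinates,
    or None if no shared repeat of at least min_len bases is found.
    """
    up_start = max(0, domain_start - flank)
    upstream = genome[up_start:domain_start]
    downstream = genome[domain_end:domain_end + flank]

    # autojunk=False stops difflib from discarding frequent characters,
    # which matters for a four-letter DNA alphabet.
    matcher = SequenceMatcher(None, upstream, downstream, autojunk=False)
    match = matcher.find_longest_match(0, len(upstream), 0, len(downstream))

    if match.size < min_len:
        return None
    return (up_start + match.a, domain_end + match.b, match.size)


if __name__ == "__main__":
    # Synthetic sequence: repeat, spacer, "domain" region, spacer, repeat.
    ltr = "ACGTTGCAAGGCTTAC"
    genome = "TTTT" + ltr + "GGGGG" + "CCCCCCCCCC" + "AAAAA" + ltr + "TTTT"
    print(candidate_ltr_pair(genome, 25, 35))  # (4, 40, 16)
```

Real LTRs are rarely identical copies, so an actual tool would need similarity search that tolerates mismatches, plus checks on other structural features. The sketch only shows why anchoring the search on a domain hit narrows the region that has to be examined.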
