Kronmiller Brent A, Wise Roger P
Department of Plant Pathology and Microbiology, Iowa State University, Ames, IA, USA.
Methods Mol Biol. 2013;1057:305-19. doi: 10.1007/978-1-62703-568-2_22.
Grass genomes harbor a diverse and complex content of repeated sequences. Most of these repeats occur as abundant transposable elements (TEs), which present unique challenges for sequencing, assembling, and annotating genomes. Multiple copies of long terminal repeat (LTR) retrotransposons can hinder sequence assembly and also cause problems with gene annotation. TEs can also contain protein-encoding genes, the ancient remnants of which can mislead gene identification software if not correctly masked. Hence, accurate assembly is crucial for gene annotation. We present TEnest v2.0, which computationally annotates nested transposable elements and displays them chronologically. TEnest annotates repeats by iterative sequence alignment, using organism-specific TE databases as a reference for reconstructing degraded TEs to their ancestral state. The output is a graphical display of the chronological nesting structure together with the coordinate positions of all TE insertions. Linux command-line and Web versions of the TEnest software are available at www.wiselab.org and www.plantgdb.org/tool/, respectively.
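The idea of a chronological nesting structure can be illustrated with a minimal sketch. This is not TEnest's implementation, which relies on iterative alignment against a TE database. It assumes, for simplicity, that each insertion is already annotated as one contiguous coordinate interval and that a TE lying entirely inside another inserted later than its host. The `Insertion` class, the function names, and the coordinates are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Insertion:
    """Hypothetical record for one annotated TE insertion."""
    name: str
    start: int
    end: int
    children: list = field(default_factory=list)


def build_nesting(insertions):
    """Arrange insertions into a containment forest.

    Assumption: an element fully contained in another inserted later,
    so children are younger than their parents.
    """
    # Sort by start ascending and end descending so hosts precede their guests.
    ordered = sorted(insertions, key=lambda te: (te.start, -te.end))
    roots, stack = [], []
    for te in ordered:
        # Drop stack entries that do not fully contain the current element.
        while stack and te.end > stack[-1].end:
            stack.pop()
        if stack:
            stack[-1].children.append(te)
        else:
            roots.append(te)
        stack.append(te)
    return roots


def nesting_depths(roots, level=0):
    """Map each element name to its depth (0 = oldest, outermost layer)."""
    depths = {}
    for te in roots:
        depths[te.name] = level
        depths.update(nesting_depths(te.children, level + 1))
    return depths


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    example = [
        Insertion("A", 100, 9000),
        Insertion("C", 3000, 3500),
        Insertion("B", 2000, 5000),
        Insertion("D", 9500, 9900),
    ]
    for name, depth in sorted(nesting_depths(build_nesting(example)).items()):
        print(name, depth)
```

Under these assumptions, a greater depth indicates a more recent insertion relative to the elements that enclose it. Depth gives only a relative order within one nest, not an absolute insertion age.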