Department of Genetics, Faculty of Biology, University of Bucharest, 060101 Bucharest, Romania.
Independent Researcher, 060101 Bucharest, Romania.
Int J Mol Sci. 2022 Oct 21;23(20):12686. doi: 10.3390/ijms232012686.
The annotation of transposable elements (transposons) is a very dynamic field of genomics, and various tools designed to support this bioinformatics endeavor have been developed and described. Genome ARTIST v1.19 (GA_v1.19) software was conceived for mapping artificial transposons mobilized during insertional mutagenesis projects, but the new functions of GA_v2 qualify it as a tool for the mapping and annotation of natural transposons (NTs) in long reads, contigs and assembled genomes. The key assets of GA_v2 are the tabular export of mapping and annotation data for high-throughput data analysis, the generation of a list of flanking sequences around the coordinates of insertion or around the target site duplications, and the computation of a consensus sequence for the flanking sequences. Additionally, we developed a set of scripts that enable the user to annotate NTs, to harness the genome annotations offered by FlyBase, to convert sequence files from .fasta to .raw, and to extract the junction query sequences essential for NT mapping. Herein, we present the applicability of GA_v2 for a preliminary annotation of the class II NTs P-element and hobo and of the copia retrotransposon in the genome of strain Horezu_LaPeri (Horezu), Romania, which was sequenced with Nanopore technology in our laboratory. We used contigs assembled with the Flye tool after applying a Q10 quality filter to the reads. Our results suggest that GA_v2 is a reliable autonomous tool able to perform mapping and annotation of NTs in genomes sequenced by long-read sequencing technology. GA_v2 is open-source software compatible with Linux, Mac OS and Windows and is available from a GitHub repository and a dedicated website.
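As an illustration of three of the operations named above (converting .fasta to .raw, extracting flanking sequences around an insertion coordinate, and computing a consensus of the flanking sequences), a minimal Python sketch follows. It is not the GA_v2 implementation or its accompanying scripts. The function names are hypothetical, the .raw format is assumed to mean the bare sequence with the header and line breaks removed, the insertion coordinate is assumed to be 0-based, and the consensus is a simple per-column majority vote over equal-length sequences, which may differ from the method used by GA_v2.

```python
from __future__ import annotations

from collections import Counter


def fasta_to_raw(fasta_text: str) -> str:
    """Drop FASTA header lines and join the rest into one uppercase string.

    Assumption: a .raw file is just the sequence, without header or newlines.
    """
    lines = (line.strip() for line in fasta_text.splitlines())
    return "".join(
        line for line in lines if line and not line.startswith(">")
    ).upper()


def flanks(sequence: str, position: int, width: int) -> tuple[str, str]:
    """Return up to `width` bases on each side of a 0-based insertion point.

    The flanks are truncated when the insertion lies near a sequence end.
    """
    left = sequence[max(0, position - width):position]
    right = sequence[position:position + width]
    return left, right


def consensus(sequences: list[str]) -> str:
    """Per-column majority vote over equal-length sequences.

    A column whose two most frequent bases are tied is reported as 'N'.
    """
    if not sequences:
        return ""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must have equal length")
    result = []
    for column in zip(*sequences):
        counts = Counter(column).most_common(2)
        tied = len(counts) == 2 and counts[0][1] == counts[1][1]
        result.append("N" if tied else counts[0][0])
    return "".join(result)
```

In such a sketch, the flanks collected from several insertions of the same transposon would be passed to `consensus` to look for a shared target-site motif. Real flanking sequences of unequal length would need alignment or trimming first, which is outside the scope of this example.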