Dias Yago José Mariz, Dezordi Filipe Zimmer, Wallau Gabriel da Luz
Núcleo de Bioinformática, Instituto Aggeu Magalhães (IAM), Fundação Oswaldo Cruz (FIOCRUZ), Recife, PE, Brazil.
Departamento de Entomologia, Instituto Aggeu Magalhães (IAM), Fundação Oswaldo Cruz (FIOCRUZ), Recife, PE, Brazil.
Comput Struct Biotechnol J. 2024 Oct 18;23:3662-3668. doi: 10.1016/j.csbj.2024.10.012. eCollection 2024 Dec.
Horizontal gene transfer is the transmission of genetic material between species that have no parent-offspring relationship. It has been characterized across several major branches of life, including prokaryotes, viruses and eukaryotes. The characterization of endogenous elements derived from viruses or bacteria provides a snapshot of past host-pathogen interactions and coevolution, as well as reference information for removing false-positive results from metagenomic studies. Currently, there is a lack of general-purpose, standardized tools for endogenous element screening, which limits reproducibility and hinders comparative analysis between studies. Here we describe EEfinder, a new general-purpose tool for the identification and classification of endogenous elements derived from viruses or bacteria found in eukaryotic genomes. The tool was developed to include six common steps performed in this type of analysis: data cleaning, similarity search through sequence alignment, filtering of candidate elements, taxonomy assignment, merging of truncated elements and extraction of flanking regions. We evaluated the sensitivity of EEfinder in identifying endogenous elements through comparative analysis using data from the literature. EEfinder automatically detected 97% of the endogenous viral elements (EVEs) reported in published results obtained by manual curation, and it recovered, almost exactly, a full genome integration that had been described using wet-lab experiments. Therefore, EEfinder can effectively and systematically identify endogenous elements of bacterial or viral origin integrated in eukaryotic genomes. EEfinder is publicly available at https://github.com/WallauBioinfo/EEfinder.
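The last two steps listed in the abstract, merging of truncated elements and extraction of flanking regions, can be illustrated with a minimal Python sketch. This is not EEfinder's implementation: the `Hit` record, the `merge_hits` and `extract_flanks` functions, the `max_gap` and `flank_len` parameters and their default values are all hypothetical names chosen for illustration. The sketch assumes that hits on the same contig with the same taxon label are treated as fragments of one element when they lie within `max_gap` bases of each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one candidate element on a host contig."""
    contig: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    taxon: str


def merge_hits(hits, max_gap=100):
    """Merge hits on the same contig and taxon separated by at most max_gap bases.

    Assumption for illustration only; the real tool may use other criteria.
    """
    merged = []
    for hit in sorted(hits, key=lambda h: (h.contig, h.taxon, h.start)):
        last = merged[-1] if merged else None
        if (last is not None
                and last.contig == hit.contig
                and last.taxon == hit.taxon
                and hit.start - last.end <= max_gap):
            # Extend the previous element to cover this fragment.
            merged[-1] = Hit(last.contig, last.start,
                             max(last.end, hit.end), last.taxon)
        else:
            merged.append(hit)
    return merged


def extract_flanks(sequence, hit, flank_len=10):
    """Return (left, right) host flanks of a hit, clipped at the contig ends."""
    left_start = max(0, hit.start - flank_len)
    right_end = min(len(sequence), hit.end + flank_len)
    return sequence[left_start:hit.start], sequence[hit.end:right_end]
```

Clipping the flanks at the contig boundaries keeps the function safe for elements located near the ends of short scaffolds, which are common in draft eukaryotic assemblies.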