Tello Daniel, Gonzalez-Garcia Laura Natalia, Gomez Jorge, Zuluaga-Monares Juan Camilo, Garcia Rogelio, Angel Ricardo, Mahecha Daniel, Duarte Erick, Leon Maria Del Rosario, Reyes Fernando, Escobar-Velásquez Camilo, Linares-Vásquez Mario, Cardozo Nicolas, Duitama Jorge
Systems and Computing Engineering Department, Universidad de los Andes, Bogotá, Colombia.
Mol Ecol Resour. 2023 Apr;23(3):712-724. doi: 10.1111/1755-0998.13737. Epub 2022 Nov 27.
Whole-genome alignment allows researchers to understand genomic structure and variation among genomes. Approaches based on direct pairwise comparisons of DNA sequences require large computational capacity. As a consequence, pipelines combining tools for orthologous gene identification and synteny analysis have been developed. In this manuscript, we present the latest functionalities implemented in NGSEP 4 to identify orthogroups and perform whole-genome alignments: identification of clusters of homologous genes, synteny analysis and whole-genome alignment. Our results showed that the NGSEP algorithm for orthogroup identification has competitive accuracy and efficiency in comparison to commonly used tools. The implementation also includes a visualization of the whole-genome alignment based on synteny of the identified orthogroups, and a reconstruction of the pangenome based on the frequencies of the orthogroups among the genomes. NGSEP 4 also includes a new graphical user interface based on JavaFX. We expect these new developments to be useful for studies in evolutionary biology and population genomics.
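As an illustration of what a pangenome reconstruction "based on frequencies of the orthogroups among the genomes" can look like, the following is a minimal sketch that labels each orthogroup by the fraction of genomes in which it occurs. It is not the NGSEP implementation: the function name `classify_orthogroups`, the input layout, the `core_fraction` threshold and the three labels are all assumptions made for this example.

```python
def classify_orthogroups(orthogroups, n_genomes, core_fraction=1.0):
    """Label orthogroups by how many genomes they occur in.

    orthogroups: dict mapping an orthogroup id to a list of
        (genome_id, gene_id) pairs (hypothetical layout).
    n_genomes: total number of genomes in the comparison.
    core_fraction: minimum fraction of genomes for the "core" label
        (assumed threshold, not taken from the paper).
    """
    if n_genomes <= 0:
        raise ValueError("n_genomes must be positive")
    labels = {}
    for og_id, members in orthogroups.items():
        # Count distinct genomes, so paralogs in one genome count once.
        genomes = {genome_id for genome_id, _gene_id in members}
        frequency = len(genomes) / n_genomes
        if frequency >= core_fraction:
            labels[og_id] = "core"
        elif len(genomes) == 1:
            labels[og_id] = "unique"
        else:
            labels[og_id] = "accessory"
    return labels


# Toy input: three genomes (A, B, C) and three orthogroups.
example = {
    "OG1": [("A", "a1"), ("B", "b1"), ("C", "c1")],
    "OG2": [("A", "a2"), ("B", "b2")],
    "OG3": [("C", "c3"), ("C", "c4")],
}
labels = classify_orthogroups(example, n_genomes=3)
```

In this toy input, `OG1` is present in all three genomes, `OG2` in two, and `OG3` only in genome C (as two paralogs). Lowering `core_fraction` gives a "soft core" definition that tolerates orthogroups missing from a few genomes.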