Seppey Mickaël, Iglhaut Clara, Gil Manuel, Anisimova Maria
Institute of Computational Life Science, Zürich University of Applied Sciences, Wädenswil, Switzerland.
Swiss Institute of Bioinformatics, Lausanne, Switzerland.
Genome Biol Evol. 2025 May 30;17(6). doi: 10.1093/gbe/evaf119.
Insertions and deletions (indels) play a critical role in the evolutionary dynamics of genomes, yet their accurate detection and interpretation in phylogenetic studies remain challenging. Our study investigates how the choice of multiple sequence alignment (MSA) and ancestral sequence reconstruction (ASR) tools influences the reconstruction of indel patterns, focusing on HIV-1 subtype B. We aim to understand how these methodological choices affect indel detection, and thereby to emphasize the importance of selecting appropriate tools for evolutionary analyses in order to improve phylogenetic accuracy. We conducted a comparative analysis using five MSA tools (MAFFT, PRANK+F, IndelMaP, ProPIP, and Historian) and five ASR tools (GRASP, FastML, IndelMaP, ARPIP, and Historian). For every MSA/ASR tool combination, we examined the inferred indel events and evaluated their rates, lengths, and positions within the genome, with particular attention to the env gene and its V1 variable loop. Although every method tested was able to reconstruct the known variable regions of the env gene, our results show that the choice of MSA tool significantly affects indel conservation and interpretation, more so than the choice of ASR tool. This finding underscores the need for context-specific MSA tool selection and provides insights for improving the accuracy of indel detection and evolutionary inference in phylogenetic studies of HIV-1 and other genomes.
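To make the comparison design concrete, the following minimal Python sketch enumerates the 25 MSA/ASR tool pairings named above and shows one simple way indel events could be read off an aligned ancestor-descendant pair. It is an illustration only, not the authors' pipeline: the function names `tool_combinations` and `indel_events`, the `-` gap character, and the rule that an event is a maximal run of adjacent columns of the same type are all assumptions introduced here.

```python
from itertools import product

# Tool names as listed in the abstract.
MSA_TOOLS = ["MAFFT", "PRANK+F", "IndelMaP", "ProPIP", "Historian"]
ASR_TOOLS = ["GRASP", "FastML", "IndelMaP", "ARPIP", "Historian"]


def tool_combinations():
    """Return every (MSA tool, ASR tool) pairing: 5 x 5 = 25 combinations."""
    return list(product(MSA_TOOLS, ASR_TOOLS))


def indel_events(parent, child, gap="-"):
    """List indel events between an aligned ancestor and descendant.

    Hypothetical helper: an event is a maximal run of adjacent alignment
    columns of the same type, returned as (kind, start_column, length).
    A column is a deletion if the parent has a residue and the child a gap,
    and an insertion if the parent has a gap and the child a residue.
    Columns where both are gaps or both are residues end the current run
    (a simplification; real tools may merge events across such columns).
    """
    if len(parent) != len(child):
        raise ValueError("aligned sequences must have equal length")
    events = []
    kind, start = None, 0
    for col, (p, c) in enumerate(zip(parent, child)):
        if p != gap and c == gap:
            state = "deletion"
        elif p == gap and c != gap:
            state = "insertion"
        else:
            state = None
        if state != kind:
            # The run type changed: close the previous run, if any.
            if kind is not None:
                events.append((kind, start, col - start))
            kind, start = state, col
    if kind is not None:
        # Close a run that extends to the last column.
        events.append((kind, start, len(parent) - start))
    return events
```

Event lengths and start columns collected this way for each tool pairing could then be summarized as per-branch rates, length distributions, and positions along the alignment, which are the quantities the abstract reports comparing.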