Exploration of the genetic landscape of bacterial dsDNA viruses reveals an ANI gap amid extensive mosaicism.

Authors

Ndovie Wanangwa, Havránek Jan, Leconte Jade, Koszucki Janusz, Chindelevitch Leonid, Adriaenssens Evelien M, Mostowy Rafal J

Affiliations

Malopolska Centre of Biotechnology, Jagiellonian University, Kraków, Poland.

Doctoral School of Exact and Natural Sciences, Jagiellonian University, Kraków, Poland.

Publication information

mSystems. 2025 Feb 18;10(2):e0166124. doi: 10.1128/msystems.01661-24. Epub 2025 Jan 29.

Abstract

Average nucleotide identity (ANI) is a widely used metric for estimating genetic relatedness, especially in microbial species delineation. While ANI calculation has been well optimized for bacteria and closely related viral genomes, estimating ANI below 80%, particularly in large reference data sets, has been challenging due to a lack of accurate and scalable methods. To bridge this gap, we introduce MANIAC, an efficient computational pipeline optimized for estimating ANI and alignment fraction (AF) in viral genomes with divergence around ANI of 70%. Using a rigorous simulation framework, we demonstrate MANIAC's accuracy and scalability compared to existing approaches, even on data sets of hundreds of thousands of viral genomes. Applying MANIAC to a curated data set of complete bacterial dsDNA viruses revealed a multimodal ANI distribution with a distinct gap around 80%, akin to the bacterial ANI gap (~90%) but shifted, likely due to viral-specific evolutionary processes such as recombination dynamics and mosaicism. We then evaluated ANI and AF as predictors of genus-level taxonomy using a logistic regression model. We found that this model has strong predictive power (PR-AUC = 0.981), but that it works much better for virulent (PR-AUC = 0.997) than for temperate (PR-AUC = 0.847) bacterial viruses. This highlights the complexity of taxonomic classification in temperate phages, known for their extensive mosaicism, and cautions against over-reliance on ANI in such cases. MANIAC can be accessed at https://github.com/bioinf-mcb/MANIAC.

Importance

We introduce a novel computational pipeline called MANIAC, designed to accurately assess average nucleotide identity (ANI) and alignment fraction (AF) between diverse viral genomes, scalable to data sets of over 100k genomes. Using computer simulations and real data analyses, we show that MANIAC can accurately estimate genetic relatedness between pairs of viral genomes of around 60%-70% ANI. We applied MANIAC to investigate the question of ANI discontinuity in bacterial dsDNA viruses, finding evidence for an ANI gap, akin to the one seen in bacteria but around ANI of 80%. We then assessed the ability of ANI and AF to predict taxonomic genus boundaries, finding strong predictive power in virulent, but not in temperate, phages. Our results suggest that bacterial dsDNA viruses may exhibit an ANI threshold (on average around 80%) above which recombination helps maintain population cohesiveness, as previously argued in bacteria.
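The abstract does not spell out how ANI and AF are computed, so the following is only a minimal sketch under a common convention, not MANIAC's actual implementation: ANI is taken as the length-weighted mean identity over the local alignments between a query and a reference genome, and AF as the fraction of the query genome covered by at least one alignment. The `Hit` record, the function name `ani_and_af`, and the example numbers are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Hit:
    """One local alignment on the query genome (hypothetical record)."""
    q_start: int      # 0-based start on the query genome
    q_end: int        # exclusive end on the query genome
    identity: float   # fraction of identical positions, between 0 and 1


def ani_and_af(hits: Iterable[Hit], query_length: int) -> Tuple[float, float]:
    """Return (ANI, AF) for one query-reference pair.

    Assumed convention, not taken from the paper:
      ANI = length-weighted mean identity of the aligned hits
      AF  = fraction of query positions covered by at least one hit
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")

    # Ignore empty or inverted intervals so they cannot distort the sums.
    valid = [h for h in hits if h.q_end > h.q_start]
    if not valid:
        return 0.0, 0.0

    aligned = sum(h.q_end - h.q_start for h in valid)
    ani = sum(h.identity * (h.q_end - h.q_start) for h in valid) / aligned

    # Merge overlapping intervals so shared positions are counted once.
    covered = 0
    cur_start, cur_end = None, None
    for start, end in sorted((h.q_start, h.q_end) for h in valid):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    covered += cur_end - cur_start

    return ani, covered / query_length


# Illustrative pair: three hits on a 10 kb query, two of them overlapping.
example_hits = [Hit(0, 1000, 0.72), Hit(500, 2000, 0.68), Hit(5000, 6000, 0.75)]
ani, af = ani_and_af(example_hits, query_length=10_000)
print(f"ANI = {ani:.3f}, AF = {af:.3f}")
```

Reporting both values matters for mosaic genomes: a pair can share a short, highly similar region (high ANI) while most of the genome does not align at all (low AF), which is one reason the two metrics are used together as predictors.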
