Howard Hughes Medical Institute and the Department of Molecular and Cell Biology, University of California, Berkeley, California, USA.
Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, USA.
Microbiol Spectr. 2024 Apr 2;12(4):e0398023. doi: 10.1128/spectrum.03980-23. Epub 2024 Mar 6.
Modern taxonomic classification is often based on phylogenetic analyses of a few molecular markers, although single-gene studies are still common. Here, we leverage genome-scale molecular phylogenetics (phylogenomics) of species and populations to reconstruct evolutionary relationships in a dense data set of 710 fungal genomes from the biomedically and technologically important genus Aspergillus. To do so, we generated a novel set of 1,362 high-quality molecular markers specific for Aspergillus and provided profile Hidden Markov Models for each, facilitating their use by others. Examining the resulting phylogeny helped resolve ongoing taxonomic controversies, identified new ones, and revealed extensive strain misidentification (7.59% of strains were previously misidentified), underscoring the importance of population-level sampling in species classification. These findings were corroborated using the current standard, taxonomically informative loci. These findings suggest that phylogenomics of species and populations can facilitate accurate taxonomic classifications and reconstructions of the Tree of Life.
IMPORTANCE: Identification of fungal species relies on the use of molecular markers. Advances in genomic technologies have made it possible to sequence the genome of any fungal strain, making it possible to use genomic data for the accurate assignment of strains to fungal species (and for the discovery of new ones). We examined the usefulness and current limitations of genomic data using a large data set of 710 publicly available genomes from multiple strains and species of the biomedically, agriculturally, and industrially important genus Aspergillus. Our evolutionary genomic analyses revealed that nearly 8% of publicly available Aspergillus genomes are misidentified. Our work highlights the usefulness of genomic data for fungal systematic biology and suggests that systematic genome sequencing of multiple strains, including reference strains (e.g., type strains), of fungal species will be required to reduce misidentification errors in public databases.