FastANI, Mash and Dashing equally differentiate between Klebsiella species.

Affiliation

Department of Biology, Wilfrid Laurier University, Waterloo, Ontario, Canada.

Publication information

PeerJ. 2022 Jul 21;10:e13784. doi: 10.7717/peerj.13784. eCollection 2022.

Abstract

Bacteria of the genus Klebsiella are among the most important multidrug-resistant human pathogens, though they have also been isolated from a variety of environments. The importance and ubiquity of these organisms call for quick and accurate methods for their classification. Average Nucleotide Identity (ANI) is becoming a standard for species delimitation based on whole-genome sequence comparison. However, much faster genome comparison tools have been appearing in the literature. In this study we tested the quality of different approaches for genome-based species delineation against ANI. To this end, we compared 1,189 genomes using measures calculated with Mash, Dashing, and DNA compositional signatures, all of which run in a fraction of the time required to obtain ANI. Receiver Operating Characteristic (ROC) curve analyses showed equal quality in species discrimination for ANI, Mash and Dashing, with Area Under the Curve (AUC) values above 0.99, followed by DNA signatures (AUC: 0.96). Accordingly, groups obtained at optimized cutoffs largely agree with species designation, with ANI, Mash and Dashing producing 15 species-level groups. DNA signatures broke the dataset into more than 30 groups. Testing Mash to map species after adding draft genomes to the dataset also showed excellent results (AUC above 0.99), producing a total of 26 species-level groups. The ecological niches of strains were found to be related neither to species delimitation nor to protein functional content, suggesting that a single species can have a wide repertoire of ecological functions.
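
The evaluation described in the abstract (ROC analysis of pairwise genome distances, then grouping at an optimized cutoff) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the genome names and distances are invented toy values, the cutoff is chosen by maximizing Youden's J, and groups are formed by single linkage. The abstract does not state how the authors optimized cutoffs or built groups, so this is not their pipeline.

```python
# Toy illustration only: names, distances, and methods are assumptions.
species = {"A1": "A", "A2": "A", "A3": "A", "B1": "B", "B2": "B"}

# Hypothetical pairwise distances (e.g. Mash-like; smaller means more similar).
pair_dist = {
    ("A1", "A2"): 0.010, ("A1", "A3"): 0.020, ("A2", "A3"): 0.015,
    ("B1", "B2"): 0.012,
    ("A1", "B1"): 0.080, ("A1", "B2"): 0.090, ("A2", "B1"): 0.100,
    ("A2", "B2"): 0.110, ("A3", "B1"): 0.085, ("A3", "B2"): 0.120,
}

def roc_auc(scores, labels):
    """AUC as the chance a positive pair outscores a negative pair (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def best_cutoff(distances, labels):
    """Cutoff maximizing Youden's J = TPR - FPR, calling 'same species' when distance <= cutoff."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_j, best_c = -1.0, None
    for c in sorted(set(distances)):
        tp = sum(d <= c and y for d, y in zip(distances, labels))
        fp = sum(d <= c and not y for d, y in zip(distances, labels))
        j = tp / n_pos - fp / n_neg
        if j > best_j:  # strict '>' keeps the smallest cutoff among ties
            best_j, best_c = j, c
    return best_c

def group_genomes(names, dist, cutoff):
    """Single-linkage groups: genomes joined by any pair with distance <= cutoff."""
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), d in dist.items():
        if d <= cutoff:
            parent[find(a)] = find(b)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())

labels = [species[a] == species[b] for (a, b) in pair_dist]
distances = list(pair_dist.values())

# Negate distances so that a higher score means "more likely the same species".
auc = roc_auc([-d for d in distances], labels)
cutoff = best_cutoff(distances, labels)
groups = group_genomes(sorted(species), pair_dist, cutoff)
print(auc, cutoff, groups)
```

With real data, the labels would come from the species designations of each genome pair and the distances from the tool under test, so the same three steps yield an AUC, a cutoff, and a count of species-level groups for each tool.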

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc72/9308963/2573a38f1121/peerj-10-13784-g001.jpg