An in-depth evaluation of metagenomic classifiers for soil microbiomes.

Authors

Edwin Niranjana Rose, Fitzpatrick Amy Heather, Brennan Fiona, Abram Florence, O'Sullivan Orla

Affiliations

Teagasc, Moorepark Food Research Centre, Moorepark, Fermoy, Cork, Ireland.

Functional Environmental Microbiology, School of Biological and Chemical Sciences, Ryan Institute, University of Galway, Galway, Ireland.

Publication information

Environ Microbiome. 2024 Mar 28;19(1):19. doi: 10.1186/s40793-024-00561-w.

Abstract

BACKGROUND

Recent endeavours in metagenomics, exemplified by projects such as the Human Microbiome Project and Tara Oceans, have illuminated the complexities of microbial biomes. Robust bioinformatic pipelines and meticulous evaluation of their methodology have contributed to the success of these projects. The soil environment, however, poses unique challenges and requires specialized methodological exploration to maximize microbial insights. A notable limitation in soil microbiome studies is the dearth of soil-specific reference databases that reflect the complexity of soil communities and are available to classifiers. There is also a lack of in-vitro mock communities derived from soil strains that can be used to assess taxonomic classification accuracy.

RESULTS

In this study, we generated a custom in-silico mock community containing microbial genomes commonly observed in the soil microbiome. Using this mock community, we simulated shotgun sequencing data to evaluate the performance of three leading metagenomic classifiers: Kraken2 (supplemented with Bracken, using both a custom database derived from GTDB-TK genomes and its own default database), Kaiju, and MetaPhlAn, the latter two with their respective default databases. Our results highlight the importance of optimizing taxonomic classification parameters and database selection, and of analysing both trimmed reads and contigs. Classifier databases tailored to the taxa present in our samples produced fewer errors than broader databases that include microbial eukaryotes, protozoa, or human genomes, highlighting the effectiveness of targeted taxonomic classification. Notably, optimal classifier performance was achieved when applying a relative abundance threshold of 0.001% or 0.005%. Kraken2 supplemented with Bracken and a custom database demonstrated superior precision, sensitivity, F1 score, and overall sequence classification. With the custom database, this classifier classified 99% of in-silico reads and 58% of real-world soil shotgun reads, and in the real-world data it identified previously overlooked phyla.
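To make the thresholding and scoring steps concrete, the following is a minimal sketch of how a relative abundance cut-off can be applied to a classifier's taxon counts before computing precision, sensitivity, and F1 against a mock community of known composition. It is not the authors' pipeline: the function names, the taxon labels, and the read counts are hypothetical and chosen only for illustration, and scoring is done on presence or absence of taxa.

```python
def apply_threshold(counts, threshold_pct):
    """Return the taxa whose relative abundance (in percent) meets the threshold.

    counts: dict mapping taxon name -> assigned read count (hypothetical input).
    threshold_pct: cut-off in percent, e.g. 0.001 for 0.001%.
    """
    total = sum(counts.values())
    if total == 0:
        return set()
    return {taxon for taxon, n in counts.items()
            if 100.0 * n / total >= threshold_pct}


def score(predicted, truth):
    """Presence/absence precision, sensitivity (recall), and F1 score."""
    tp = len(predicted & truth)   # taxa correctly reported
    fp = len(predicted - truth)   # taxa reported but not in the mock community
    fn = len(truth - predicted)   # mock-community taxa that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return precision, sensitivity, f1


# Hypothetical mock community and classifier output (illustrative numbers only).
truth = {"taxon_A", "taxon_B", "taxon_C"}
counts = {"taxon_A": 600000, "taxon_B": 399990, "taxon_C": 4, "taxon_X": 6}

unfiltered = apply_threshold(counts, 0.0)    # keep every reported taxon
filtered = apply_threshold(counts, 0.001)    # drop taxa below 0.001%

print(score(unfiltered, truth))
print(score(filtered, truth))
```

In this toy input the threshold removes the false positive `taxon_X` but also the genuinely present low-abundance `taxon_C`, which illustrates the trade-off between precision and sensitivity that a threshold such as 0.001% or 0.005% has to balance.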

CONCLUSION

This study underscores the potential advantages of in-silico methodological optimization in metagenomic analyses, especially when deciphering the complexities of soil microbiomes. We demonstrate that the choice of classifier and database significantly impacts microbial taxonomic profiling. Our findings suggest that Kraken2 with Bracken, coupled with a custom database of GTDB-TK genomes and fungal genomes and applied at a relative abundance threshold of 0.001%, provides optimal accuracy in soil shotgun metagenome analysis.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/26f2/10979606/5a136975f3d2/40793_2024_561_Fig1_HTML.jpg
