Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands.
Forensic Sci Int Genet. 2020 May;46:102257. doi: 10.1016/j.fsigen.2020.102257. Epub 2020 Feb 5.
The assessment of microbiome biodiversity is the most common application of metagenomics. While 16S sequencing remains the standard procedure for taxonomic profiling of metagenomic data, a growing number of studies have clearly demonstrated biases associated with this method. Whole Genome Shotgun sequencing (WGS) metagenomics alleviates most of the known restrictions associated with 16S data. However, because of its computationally intensive data analysis and higher sequencing costs, WGS-based metagenomics remains a less popular option. Selecting the experiment type that provides a comprehensive yet manageable amount of information is a challenge encountered in many metagenomics studies. In this work, we created a series of artificial bacterial mixes, each with a different distribution of skin-associated microbial species. These mixes were used to estimate the resolution of two different metagenomic experiments (16S and WGS) and to evaluate several different bioinformatics approaches for taxonomic read classification. In all test cases, WGS approaches provided much more accurate results than 16S, in terms of both taxa prediction and abundance estimation. Furthermore, we demonstrate that a single 16S dataset, analysed using different state-of-the-art techniques and reference databases, can produce widely different results. Because most forensic metagenomic analyses are still performed using 16S data, our results are especially important.
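To make the two evaluation criteria named above (taxa prediction and abundance estimation) concrete, the following is a minimal sketch of how a classifier's output could be scored against a mix of known composition. It is not the authors' pipeline: the choice of precision/recall and Bray-Curtis dissimilarity as metrics, the function names, the species names, and all numbers are assumptions made purely for illustration.

```python
# Illustrative scoring of a taxonomic profile against a known mock mix.
# All names and values below are hypothetical, not data from the study.

def normalize(counts):
    """Convert raw read counts per taxon into relative abundances summing to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads were assigned to any taxon")
    return {taxon: n / total for taxon, n in counts.items()}


def taxa_precision_recall(expected, observed):
    """Score presence/absence calls: which taxa were found vs. which were in the mix."""
    exp = {t for t, a in expected.items() if a > 0}
    obs = {t for t, a in observed.items() if a > 0}
    true_positives = len(exp & obs)
    precision = true_positives / len(obs) if obs else 0.0
    recall = true_positives / len(exp) if exp else 0.0
    return precision, recall


def bray_curtis(expected, observed):
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical)."""
    taxa = set(expected) | set(observed)
    diff = sum(abs(expected.get(t, 0.0) - observed.get(t, 0.0)) for t in taxa)
    total = sum(expected.get(t, 0.0) + observed.get(t, 0.0) for t in taxa)
    return diff / total if total else 0.0


# Hypothetical designed composition of one artificial mix (relative abundances).
expected_mix = {
    "Staphylococcus epidermidis": 0.5,
    "Cutibacterium acnes": 0.3,
    "Micrococcus luteus": 0.2,
}

# Hypothetical read counts reported by some classifier for that mix.
classified_reads = {
    "Staphylococcus epidermidis": 600,
    "Cutibacterium acnes": 300,
    "Escherichia coli": 100,  # a false-positive call in this toy example
}

observed_mix = normalize(classified_reads)
precision, recall = taxa_precision_recall(expected_mix, observed_mix)
dissimilarity = bray_curtis(expected_mix, observed_mix)
print(f"precision={precision:.2f} recall={recall:.2f} bray_curtis={dissimilarity:.2f}")
```

Running the same scoring on the profiles produced by each classifier and reference database for the same mix would give one way to compare them side by side; other metrics could equally be substituted.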