Yunyun Gao, Hao Luo, Hujie Lyu, Haifei Yang, Salsabeel Yousuf, Shi Huang, Yong-Xin Liu
Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China.
Department of Life Sciences, Imperial College London, London SW7 2AZ, UK.
Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giaf004.
The rapid evolution of metagenomic sequencing technology offers remarkable opportunities to explore the intricate roles of the microbiome in host health and disease, as well as to uncover the unknown structure and functions of microbial communities. However, the swift accumulation of metagenomic data poses substantial challenges for data analysis. Contamination from host DNA introduces nontarget sequences, which can substantially compromise the accuracy of results and increase the computational resources required.
In this study, we assessed the impact of computational host DNA decontamination on downstream analyses, highlighting its importance in producing accurate results efficiently. We also evaluated the performance of conventional tools such as KneadData, Bowtie2, BWA, KMCP, Kraken2, and KrakenUniq, each of which offers distinct advantages for different applications. Furthermore, we highlighted the importance of an accurate host reference genome, noting that its absence degraded decontamination performance across all tools.
Our findings underscore the need for careful selection of decontamination tools and reference genomes to enhance the accuracy of metagenomic analyses. These insights provide valuable guidance for improving the reliability and reproducibility of microbiome research.
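To make the idea of computational host decontamination concrete, the following is a minimal toy sketch, not the algorithm of any tool named above. It assumes a simple k-mer lookup: every k-mer of a host reference (both strands) is indexed, and a read is discarded when the fraction of its k-mers found in that index reaches a threshold. The function names, the value of `k`, and the 0.5 threshold are all hypothetical choices for illustration; real tools rely on alignment or far more compact indexes.

```python
# Toy k-mer based host read filter (illustrative assumption, not a real tool's method).

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Yield each k-mer of seq that contains only A, C, G, or T."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            yield kmer


def build_host_index(host_seqs, k):
    """Collect the k-mers of every host sequence on both strands."""
    index = set()
    for seq in host_seqs:
        index.update(kmers(seq, k))
        index.update(kmers(reverse_complement(seq), k))
    return index


def host_fraction(read, index, k):
    """Fraction of the read's valid k-mers that occur in the host index."""
    total = 0
    hits = 0
    for kmer in kmers(read, k):
        total += 1
        if kmer in index:
            hits += 1
    return hits / total if total else 0.0


def remove_host_reads(reads, index, k, threshold=0.5):
    """Split (name, sequence) pairs into kept reads and host-like reads."""
    kept, removed = [], []
    for name, seq in reads:
        if host_fraction(seq, index, k) >= threshold:
            removed.append((name, seq))
        else:
            kept.append((name, seq))
    return kept, removed
```

The sketch also shows why the reference matters: a read can only be flagged as host if its k-mers are present in the index, so an incomplete or mismatched host reference leaves host reads in the kept set.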