Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, 08540, USA.
Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14853, USA.
BMC Genomics. 2024 Nov 4;25(1):1033. doi: 10.1186/s12864-024-10956-1.
Advances in assembling microbial genomes have led to the growth of reference genome databases, which have been transformative for applied and basic microbiome research. Here we show that published microbial genome databases from humans, mice, cows, pigs, fish, honeybees, and marine environments contain significant sequencing-adapter contamination that systematically reduces assembly accuracy and contiguity. By removing the adapter-contaminated ends of contiguous sequences and reassembling MGnify reference genomes, we improve the quality of assemblies in these databases.
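To make the idea of removing adapter-contaminated contig ends concrete, here is a minimal Python sketch. It is not the authors' pipeline: the abstract does not say which adapters were found or which tools were used. The adapter string (a commonly cited Illumina adapter prefix), the search window, and the function name `trim_adapter_ends` are all assumptions chosen for illustration, and only exact matches are detected.

```python
# Illustrative sketch only; not the method used in the paper.
# ASSUMPTION: a commonly cited Illumina adapter prefix stands in for
# whatever adapters actually contaminate a given database.
ADAPTER = "AGATCGGAAGAGC"
# ASSUMPTION: only look for adapters within this many bases of each end.
WINDOW = 100

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def trim_adapter_ends(contig: str, adapter: str = ADAPTER,
                      window: int = WINDOW) -> str:
    """Trim exact adapter matches (either strand) found near contig ends.

    The returned sequence is uppercased. Interior matches outside the
    end windows are left untouched.
    """
    seq = contig.upper()
    motifs = {adapter, reverse_complement(adapter)}
    # Keep the two end windows from overlapping on short contigs.
    window = min(window, len(seq) // 2)

    # 3' end: cut from the earliest adapter hit inside the final window.
    tail_start = len(seq) - window
    hits = [seq.find(m, tail_start) for m in motifs]
    hits = [h for h in hits if h != -1]
    if hits:
        seq = seq[:min(hits)]

    # 5' end: cut through the last adapter hit inside the first window.
    head = seq[:window]
    cut = 0
    for m in motifs:
        pos = head.rfind(m)
        if pos != -1:
            cut = max(cut, pos + len(m))
    return seq[cut:]
```

In practice, a dedicated adapter trimmer that tolerates mismatches and partial adapters would be a more robust choice than exact string matching, and the trimmed contigs would still need to be reassembled, as the abstract describes.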