Foo Aidan, Cerdeira Louise, Hughes Grant L, Heinz Eva
Vector Biology and Tropical Disease Biology, Liverpool School of Tropical Medicine, Liverpool, L3 5QA, UK.
Vector Biology, Liverpool School of Tropical Medicine, Liverpool, L3 5QA, UK.
Wellcome Open Res. 2023 May 26;8:131. doi: 10.12688/wellcomeopenres.19155.2. eCollection 2023.
Ongoing research into the mosquito microbiome aims to uncover novel strategies to reduce pathogen transmission. However, sequencing costs, especially for metagenomics, remain significant. A resource increasingly used to gain insights into host-associated microbiomes is the large amount of publicly available genomic data derived from whole organisms such as mosquitoes. These data include sequencing reads from host-associated microbes and so offer additional value from sequencing projects that were initially host-focused. To analyse non-host reads from existing genomic data, we developed a Snakemake workflow called MINUUR (Microbial INsights Using Unmapped Reads). Within MINUUR, reads derived from the host-associated microbiome were extracted and characterised using taxonomic classification and metagenome assembly, followed by binning and quality assessment. We applied this pipeline to five publicly available genomic datasets, consisting of 62 samples with a broad range of sequencing depths. We demonstrate that MINUUR recovers previously identified phyla and genera and is able to extract bacterial metagenome-assembled genomes (MAGs) associated with the microbiome. Of these MAGs, 42 are high-quality representatives with >90% completeness and <5% contamination. These MAGs improve the genomic representation of the mosquito microbiome and can be used to facilitate genomic investigation of key genes of interest. Furthermore, we show that samples with a high number of KRAKEN2-assigned reads produce more MAGs. Our metagenomics workflow, MINUUR, was applied to a range of genomic samples to characterise microbiome-associated reads. We confirm the presence of key mosquito-associated symbionts previously identified in other studies and recover high-quality bacterial MAGs.
In addition, MINUUR and its associated documentation are freely available on GitHub and provide researchers with a convenient workflow to investigate microbiome data included in the sequencing data for any applicable host genome of interest.
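The high-quality threshold quoted above (>90% completeness and <5% contamination) can be illustrated with a minimal Python sketch. It is not taken from the MINUUR codebase: the `Mag` record, the function names, and the example bins are hypothetical, and the completeness and contamination percentages are assumed to come from whatever quality-assessment step is run on the bins.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Mag:
    """Hypothetical record for one metagenome-assembled genome (bin)."""
    name: str
    completeness: float   # percent, 0-100
    contamination: float  # percent, 0-100


def is_high_quality(mag: Mag,
                    min_completeness: float = 90.0,
                    max_contamination: float = 5.0) -> bool:
    """Apply the abstract's criteria: >90% complete and <5% contaminated."""
    return (mag.completeness > min_completeness
            and mag.contamination < max_contamination)


def high_quality_mags(mags: Iterable[Mag]) -> List[Mag]:
    """Keep only the bins that pass both strict thresholds."""
    return [mag for mag in mags if is_high_quality(mag)]


if __name__ == "__main__":
    # Made-up example values, for illustration only (not results from the study).
    example_bins = [
        Mag("bin_01", completeness=98.2, contamination=0.7),
        Mag("bin_02", completeness=91.5, contamination=6.3),
        Mag("bin_03", completeness=72.0, contamination=1.1),
    ]
    for mag in high_quality_mags(example_bins):
        print(mag.name)
```

Both comparisons are strict, mirroring the ">" and "<" in the text, so a bin at exactly 90% completeness or exactly 5% contamination is not counted as high quality in this sketch.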