Environmental Microbiome Engineering and Biotechnology Laboratory, The University of Hong Kong, Hong Kong SAR, China.
State Environmental Protection Key Laboratory of Integrated Surface Water-Groundwater Pollution Control, School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen, China.
Microbiome. 2020 Nov 6;8(1):155. doi: 10.1186/s40168-020-00937-3.
Genome-centric approaches are widely used to investigate microbial composition, dynamics, ecology, and interactions within various environmental systems. Hundreds or even thousands of genomes can be retrieved in a single study, thanks to cost-effective short-read sequencing and well-developed assembly/binning pipelines. However, conventional binning methods usually yield highly fragmented draft genomes, which limits our ability to comprehensively understand these microbial communities. Leveraging the advantages of both long and short reads to retrieve more complete genomes from environmental samples is therefore essential to move this field forward.
Here, we used an iterative hybrid assembly (IHA) approach to reconstruct 49 metagenome-assembled genomes (MAGs) from a partial-nitritation anammox (PNA) reactor, including 27 high-quality (HQ) and high-contiguity (HC) genomes with contig number ≤ 5, eight of which were circular finished genomes. These 49 recovered MAGs (43 MAGs encoding full-length rRNA, average N50 of 2.2 Mbp) represented the majority (92.3%) of the bacterial community. Moreover, the workflow retrieved HQ and HC MAGs even at extremely low coverage (relative abundance < 0.1%). Of the 49 MAGs, 34 could not be assigned at the genus level, indicating the novelty of the genomes retrieved using the IHA method proposed in this study. Comparative analysis of HQ MAG pairs reconstructed using two methods, i.e., hybrid and short-read-only assembly, revealed that identical genes in the MAG pairs represented 87.5% and 95.5% of the total gene inventory of the hybrid and short-read-only assembled MAGs, respectively. In addition, the first finished anammox genome of the genus Ca. Brocadia, reconstructed here, revealed two identical hydrazine synthase (hzs) genes, providing the exact gene copy number of this crucial phylomarker of anammox at the genome level.
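For readers unfamiliar with the contiguity metric quoted above, the following is a minimal sketch of how an N50 value is conventionally computed from a MAG's contig lengths. The function name `n50` and the example lengths are illustrative assumptions; they are not taken from the study or its pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of a set of contig lengths (in bp).

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the assembly.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical MAG with three contigs (values invented for illustration).
example_mag = [2_000_000, 1_000_000, 500_000]
print(n50(example_mag))
```

A single-contig (e.g. circular finished) genome has an N50 equal to its full length, which is why a small contig count and a high N50 tend to go together.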
Our results showcase the high-quality, high-contiguity genome retrieval performance of the IHA workflow and demonstrate the feasibility of complete genome reconstruction from an enrichment system. These (near-)complete genomes provide a high-resolution view of the microbial community, which may help to understand the bacterial repertoire of anammox-associated systems. Combined with other validation experiments, the workflow can enable a detailed view of anammox or other similar enrichment systems.