Farag Ibrahim F, Youssef Noha H, Elshahed Mostafa S
Department of Microbiology and Molecular Genetics, Oklahoma State University, Stillwater, Oklahoma, USA.
Appl Environ Microbiol. 2017 May 1;83(10). doi: 10.1128/AEM.00521-17. Print 2017 May 15.
We investigated the global distribution patterns and pangenomic diversity of the candidate phylum "Latescibacteria" (WS3) in 16S rRNA gene as well as metagenomic data sets. We document distinct distribution patterns for various "Latescibacteria" orders in 16S rRNA gene data sets, with prevalence of orders sediment_1 in terrestrial, PBSIII_9 in groundwater and temperate freshwater, and GN03 in pelagic marine, saline-hypersaline, and wastewater habitats. Using a fragment recruitment approach, we identified 68.9 Mb of "Latescibacteria"-affiliated contigs, encoding 73,079 proteins, in publicly available metagenomic data sets. Metabolic reconstruction suggests a prevalent saprophytic lifestyle in all "Latescibacteria" orders, with marked capacities for the degradation of proteins, lipids, and polysaccharides predominant in plant, bacterial, fungal/crustacean, and eukaryotic algal cell walls. In addition, extensive transport systems and central metabolic pathways for the metabolism of imported monomers were identified. Interestingly, genes and domains suggestive of the production of a cellulosome (e.g., protein-coding genes harboring dockerin I domains attached to a glycosyl hydrolase, and scaffoldin-encoding genes harboring cohesin I and CBM37 domains) were identified in order PBSIII_9, GN03, and MSB-4E2 fragments recovered from four anoxic aquatic habitats, hence extending the cellulosomal production capabilities in Bacteria beyond the Gram-positive Firmicutes. In addition to fermentative pathways, a complete electron transport chain with terminal cytochrome oxidases Caa3 (for operation under high oxygen tension) and Cbb3 (for operation under low oxygen tension) was identified in PBSIII_9 and GN03 fragments recovered from oxygenated and partially/seasonally oxygenated aquatic habitats. Our metagenomic recruitment effort hence represents a comprehensive pangenomic view of this yet-uncultured phylum and provides insights broader than and complementary to those gained from genome recovery initiatives focusing on a single or few sampled environments.
Our understanding of the phylogenetic diversity, metabolic capabilities, and ecological roles of yet-uncultured microorganisms is rapidly expanding. However, recent efforts have mainly focused on recovering genomes of novel microbial lineages from a specific sampling site, rather than from a wide range of environmental habitats. To comprehensively evaluate the genomic landscape, putative metabolic capabilities, and ecological roles of yet-uncultured candidate phyla, efforts that focus on the recovery of genomic fragments from a wide range of habitats and that adequately sample the intraphylum diversity within a specific target lineage are needed. Here, we investigated the global distribution patterns and pangenomic diversity of the candidate phylum "Latescibacteria." Our results document the preference of specific "Latescibacteria" orders for specific habitats, the prevalence of plant polysaccharide degradation abilities within all "Latescibacteria" orders, the occurrence of all genes/domains necessary for the production of cellulosomes within three "Latescibacteria" orders (GN03, PBSIII_9, and MSB-4E2) in data sets recovered from anaerobic locations, and the identification of the components of an aerobic respiratory chain, as well as the occurrence of multiple O2-dependent metabolic reactions, in "Latescibacteria" orders GN03 and PBSIII_9 recovered from oxygenated habitats. The results demonstrate the value of phylocentric pangenomic surveys for understanding the global ecological distribution and panmetabolic abilities of yet-uncultured microbial lineages, since they provide broader and more complementary insights than those gained from single-cell genomic and/or metagenomic-enabled genome recovery efforts focusing on a single sampling site.