Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA.
Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
PLoS Comput Biol. 2018 Aug 9;14(8):e1006242. doi: 10.1371/journal.pcbi.1006242. eCollection 2018 Aug.
The mechanisms by which different microbes colonize the healthy human gut versus other body sites, the gut in disease states, or other environments remain largely unknown. Identifying microbial genes influencing fitness in the gut could lead to new ways to engineer probiotics or disrupt pathogenesis. We approach this problem by measuring the statistical association between a species having a gene and the probability that the species is present in the gut microbiome. The challenge is that closely related species tend to be jointly present or absent in the microbiome and also share many genes, only a subset of which are involved in gut adaptation. We show that this phylogenetic correlation indeed leads to many false discoveries and propose phylogenetic linear regression as a powerful solution. To apply this method across the bacterial tree of life, where most species have not been experimentally phenotyped, we use metagenomes from hundreds of people to quantify each species' prevalence in and specificity for the gut microbiome. This analysis reveals thousands of genes potentially involved in adaptation to the gut across species, including many novel candidates as well as processes known to contribute to fitness of gut bacteria, such as acid tolerance in Bacteroidetes and sporulation in Firmicutes. We also find microbial genes associated with a preference for the gut over other body sites, which are significantly enriched for genes linked to fitness in an in vivo competition experiment. Finally, we identify gene families associated with higher prevalence in patients with Crohn's disease, including Proteobacterial genes involved in conjugation and fimbria regulation, processes previously linked to inflammation. These gene targets may represent new avenues for modulating host colonization and disease. Our strategy of combining metagenomics with phylogenetic modeling is general and can be used to identify genes associated with adaptation to any environment.
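To make the phylogenetic-correlation problem concrete, here is a minimal sketch of a phylogenetic regression fitted by generalized least squares. It is not the authors' implementation. It assumes the residual covariance between species is known and proportional to a matrix `C` (for example, shared branch lengths under a Brownian-motion model). The function name `phylo_gls`, the six-species toy tree, and the prevalence values are all hypothetical and chosen only for illustration.

```python
import numpy as np

def phylo_gls(y, gene, C):
    """Fit y = b0 + b1 * gene + e by GLS, assuming Cov(e) is proportional to C.

    Returns the coefficient estimates and their standard errors.
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(y), np.asarray(gene, dtype=float)])
    # Whiten with the Cholesky factor: C = L L^T, so L^{-1} e is uncorrelated.
    L = np.linalg.cholesky(C)
    Xw = np.linalg.solve(L, X)
    yw = np.linalg.solve(L, y)
    # Ordinary least squares on the whitened data is GLS on the original data.
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    resid = yw - Xw @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Xw.T @ Xw)))
    return beta, se

# Hypothetical toy data: two clades of three closely related species each.
# Every species in clade A carries the gene and is prevalent in the gut;
# no species in clade B carries it.
gene = np.array([1, 1, 1, 0, 0, 0])
prevalence = np.array([2.1, 1.9, 2.0, 0.1, -0.1, 0.0])  # e.g. a transformed prevalence score

# Assumed covariance: strong correlation (0.9) within a clade, none between clades.
within_clade = 0.1 * np.eye(3) + 0.9 * np.ones((3, 3))
C = np.kron(np.eye(2), within_clade)

# A naive fit treats species as independent (identity covariance).
beta_ols, se_ols = phylo_gls(prevalence, gene, np.eye(6))
# The phylogenetic fit accounts for shared ancestry.
beta_gls, se_gls = phylo_gls(prevalence, gene, C)

print("naive t-statistic:       ", beta_ols[1] / se_ols[1])
print("phylogenetic t-statistic:", beta_gls[1] / se_gls[1])
```

In this toy setting both fits estimate the same gene effect, but the phylogenetic fit reports a much larger standard error, because six species in two tight clades carry far less independent evidence than six unrelated species would. That is the mechanism by which ignoring phylogeny inflates false discoveries.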