Department of Microbiology, Cornell University, Ithaca, New York, United States of America.
PLoS Comput Biol. 2013;9(1):e1002863. doi: 10.1371/journal.pcbi.1002863. Epub 2013 Jan 10.
Recent analyses of human-associated bacterial diversity have categorized individuals into 'enterotypes', or clusters, based on the abundances of key bacterial genera in the gut microbiota. There is a lack of consensus, however, on the analytical basis for enterotypes and on the interpretation of these results. We tested how the following factors influenced the detection of enterotypes: clustering methodology, distance metrics, OTU-picking approaches, sequencing depth, data type (whole genome shotgun (WGS) vs. 16S rRNA gene sequence data), and 16S rRNA region. We included 16S rRNA gene sequences from the Human Microbiome Project (HMP) and from 16 additional studies, and WGS sequences from the HMP and MetaHIT. In most body sites, we observed smooth abundance gradients of key genera without discrete clustering of samples. Some body habitats displayed bimodal (e.g., gut) or multimodal (e.g., vagina) distributions of sample abundances, but not all clustering methods and workflows accurately highlighted such clusters. Because the detection of enterotypes depends not only on the structure of the data but also on the methods used to assess clustering strength, we recommend that multiple approaches be used and compared when testing for enterotypes.
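The recommendation to compare multiple approaches can be illustrated with a minimal sketch. It is not the workflow used in the study: the synthetic three-genus profiles, the two distance metrics (Bray-Curtis and the square root of the Jensen-Shannon divergence), the simple k-medoids routine, and the silhouette score as a measure of clustering strength are all illustrative assumptions, and every function name is hypothetical. The sketch scores a two-cluster partition on a smooth abundance gradient and on a bimodal distribution, under each metric.

```python
import math
import random

def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    den = sum(a + b for a, b in zip(p, q))
    return sum(abs(a - b) for a, b in zip(p, q)) / den if den else 0.0

def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence (log base 2)."""
    def kl(x, m):
        return sum(a * math.log2(a / b) for a, b in zip(x, m) if a > 0)
    mid = [(a + b) / 2 for a, b in zip(p, q)]
    return math.sqrt(max(0.0, 0.5 * kl(p, mid) + 0.5 * kl(q, mid)))

METRICS = {"bray_curtis": bray_curtis, "jensen_shannon": js_distance}

def distance_matrix(samples, metric):
    n = len(samples)
    return [[metric(samples[i], samples[j]) for j in range(n)] for i in range(n)]

def k_medoids(dist, k, max_iter=100):
    """Simple alternating k-medoids with a deterministic farthest-first start."""
    n = len(dist)
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))

    def assign(meds):
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            # Keep the old medoid if a cluster happens to be empty.
            updated.append(min(members, key=lambda m: sum(dist[m][j] for j in members))
                           if members else medoids[c])
        if updated == medoids:
            break
        medoids = updated
    return assign(medoids)

def silhouette(dist, labels):
    """Mean silhouette width; singleton clusters contribute 0."""
    n = len(labels)
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:
            continue
        a = sum(dist[i][j] for j in own) / len(own)
        others = []
        for c in clusters:
            if c != labels[i]:
                members = [j for j in range(n) if labels[j] == c]
                others.append(sum(dist[i][j] for j in members) / len(members))
        b = min(others)
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / n

def simulate(kind, n=40, seed=1):
    """Synthetic profiles (an assumption): genus A has relative abundance x."""
    rng = random.Random(seed)
    if kind == "gradient":
        xs = [rng.uniform(0.05, 0.9) for _ in range(n)]
    else:  # "bimodal"
        xs = [rng.uniform(0.1, 0.2) if i % 2 == 0 else rng.uniform(0.75, 0.85)
              for i in range(n)]
    return [[x, 0.7 * (1 - x), 0.3 * (1 - x)] for x in xs]

def compare(n=40):
    """Silhouette of a 2-cluster partition for each data shape and metric."""
    results = {}
    for kind in ("gradient", "bimodal"):
        samples = simulate(kind, n)
        for name, metric in METRICS.items():
            dist = distance_matrix(samples, metric)
            results[(kind, name)] = silhouette(dist, k_medoids(dist, k=2))
    return results

if __name__ == "__main__":
    for key, score in compare().items():
        print(key, round(score, 3))
```

A clustering algorithm returns a two-cluster partition even for the gradient data, so the partition alone is not evidence of discrete groups. Comparing a clustering-strength score across data shapes and metrics, as sketched here, is one way to check whether apparent clusters reflect structure in the data or the method applied.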