Division of Microbial Ecology, Department of Microbiology and Ecosystem Science, University of Vienna, Vienna, Austria.
Division of Computational Systems Biology, Department of Microbiology and Ecosystem Science, University of Vienna, Vienna, Austria.
ISME J. 2014 Jan;8(1):115-25. doi: 10.1038/ismej.2013.142. Epub 2013 Aug 15.
In the era of metagenomics and amplicon sequencing, comprehensive analysis of available sequence data remains a challenge. Here we describe an approach that exploits metagenomic and amplicon data sets from public databases to elucidate the phylogenetic diversity of defined microbial taxa. We investigated the phylum Chlamydiae, whose known members are obligate intracellular bacteria that include important pathogens of humans and animals as well as symbionts of protists. Despite their medical relevance, knowledge of chlamydial diversity is still scarce. Most of the nine known families are represented by only a few isolates, while previous clone library-based surveys suggested the existence of as yet uncharacterized members of this phylum. We identified more than 22,000 high-quality, non-redundant chlamydial 16S rRNA gene sequences in diverse databases, as well as 1,900 putative chlamydial protein-encoding genes. Even under the most conservative approach, clustering of chlamydial 16S rRNA gene sequences into operational taxonomic units revealed unexpectedly high species-, genus- and family-level diversity within the Chlamydiae, including 181 putative families. These in silico findings were verified experimentally in one Antarctic sample, which contained a high diversity of novel Chlamydiae. In our analysis, the Rhabdochlamydiaceae, whose known members infect arthropods, is the most diverse and species-rich chlamydial family, followed by the protist-associated Parachlamydiaceae and a putative new family (PCF8) with unknown host specificity. Available information on the origin of the metagenomic samples indicated that marine environments contain the majority of the newly discovered chlamydial lineages, highlighting this environment as an important chlamydial reservoir.
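The clustering of 16S rRNA gene sequences into operational taxonomic units at several taxonomic levels can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the sequences are already aligned to equal length, uses a simple greedy centroid scheme, and the identity thresholds (0.97, 0.95, 0.90) are illustrative values that the abstract does not state. The function names and the toy sequences are hypothetical.

```python
from typing import Dict, List


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Skip alignment columns where both sequences have a gap.
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def greedy_otus(seqs: Dict[str, str], threshold: float) -> List[List[str]]:
    """Greedy centroid clustering: each sequence joins the first centroid it
    matches at or above the threshold, otherwise it founds a new OTU."""
    centroids: List[str] = []
    clusters: List[List[str]] = []
    for name, seq in seqs.items():
        for i, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                clusters[i].append(name)
                break
        else:
            centroids.append(seq)
            clusters.append([name])
    return clusters


# Hypothetical toy alignment (20 columns), not real chlamydial data.
toy = {
    "s1": "ACGTACGTACGTACGTACGT",
    "s2": "ACGTACGTACGTACGTACGA",
    "s3": "ACGTACGTACGTACGTACCA",
    "s4": "TTTTACGTACGTACGTTTTT",
}

# Assumed, illustrative thresholds for three taxonomic levels.
for level, cutoff in [("species", 0.97), ("genus", 0.95), ("family", 0.90)]:
    otus = greedy_otus(toy, cutoff)
    print(level, cutoff, len(otus), otus)
```

Lowering the threshold merges more sequences into each unit, which is why the number of OTUs shrinks from the species level to the family level. Greedy clustering depends on input order, and real surveys would rely on a proper alignment and a dedicated clustering tool.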