Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA.
Institute for Human Genetics, University of California, San Francisco, CA, USA.
Microbiome. 2021 Mar 3;9(1):58. doi: 10.1186/s40168-021-01015-y.
Microbial eukaryotes are found alongside bacteria and archaea in natural microbial systems, including host-associated microbiomes. While microbial eukaryotes are critical to these communities, they are challenging to study with shotgun sequencing techniques and are therefore often excluded.
Here, we present EukDetect, a bioinformatics method to identify eukaryotes in shotgun metagenomic sequencing data. Our approach uses a database of 521,824 universal marker genes from 241 conserved gene families, which we curated from 3713 fungal, protist, non-vertebrate metazoan, and non-streptophyte archaeplastida genomes and transcriptomes. EukDetect has broad taxonomic coverage of microbial eukaryotes, performs well on low-abundance and closely related species, and is resilient against bacterial contamination in eukaryotic genomes. Using EukDetect, we describe the spatial distribution of eukaryotes along the human gastrointestinal tract, showing that fungi and protists are present in the lumen and mucosa throughout the large intestine. We discover that a succession of eukaryotes colonizes the human gut during the first years of life, mirroring patterns of developmental succession observed in gut bacteria. By comparing DNA and RNA sequencing of paired samples from human stool, we find that many eukaryotes continue active transcription after passage through the gut, whereas others do not, suggesting that the latter are dormant or nonviable. We analyze metagenomic data from the Baltic Sea and find that eukaryotes differ across locations and salinity gradients. Finally, we observe eukaryotes in Arabidopsis leaf samples, many of which are not identifiable from public protein databases.
EukDetect provides an automated and reliable way to characterize eukaryotes in shotgun sequencing datasets from diverse microbiomes. We demonstrate that it enables discoveries that would be missed or clouded by false positives with standard shotgun sequence analysis. EukDetect will greatly advance our understanding of how microbial eukaryotes contribute to microbiomes.