Department of Biology, Lund University, Lund 22362, Sweden.
Department of Mathematics, Sheffield University, Sheffield S3 7RH, UK.
Genome Biol Evol. 2024 Nov 1;16(11). doi: 10.1093/gbe/evae209.
Over the past decade, sequencing data generated by large microbiome projects have shown that taxa exhibit a patchy geographical distribution, raising questions about the geospatial dynamics that shape natural microbiomes and the spread of antimicrobial resistance genes. Answering these questions requires distinguishing between local and nonlocal microorganisms and identifying the source sites for the latter. Predicting the source sites and migration routes of microbiota has been envisioned for decades but was hampered by the lack of data, tools, and understanding of the processes governing biodiversity. State-of-the-art biogeographical tools suffer from low resolution and cannot predict biogeographical patterns at a scale relevant to ecological, medical, or epidemiological applications. Analyzing urban, soil, and marine microorganisms, we found that some taxa exhibit region-specific composition and abundance, suggesting they can be used as biogeographical biomarkers. We developed the microbiome geographic population structure, a machine learning-based tool that uses microbial relative sequence abundances to yield a fine-scale source site for microorganisms. Microbiome geographic population structure predicted the source city for 92% of the samples and the within-city source for 82% of the samples, even though the within-city sites were often only a few hundred meters apart. Microbiome geographic population structure also predicted soil and marine sampling sites for 86% and 74% of the samples, respectively. We demonstrated that microbiome geographic population structure differentiated local from nonlocal microorganisms and used it to trace the global spread of antimicrobial resistance genes. Microbiome geographic population structure's ability to localize samples to their water body, country, city, and transit stations opens new possibilities in tracing microbiomes and has applications in forensics, medicine, and epidemiology.
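The idea of using relative sequence abundances to assign a sample to a source site can be illustrated with a minimal sketch. This is not the tool described above: the abstract does not specify the model, so a simple nearest-centroid rule with Bray-Curtis dissimilarity is swapped in purely for illustration. The function names, site labels (`site_A`, `site_B`), taxon names, and read counts are all hypothetical.

```python
from collections import defaultdict


def relative_abundance(counts):
    """Scale raw taxon read counts so that they sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance profiles (0 = identical)."""
    taxa = set(a) | set(b)
    num = sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in taxa)
    den = sum(a.get(t, 0.0) + b.get(t, 0.0) for t in taxa)
    return num / den if den else 0.0


def fit_centroids(samples):
    """Average the relative-abundance profiles of the samples from each site."""
    sums = defaultdict(lambda: defaultdict(float))
    n_samples = defaultdict(int)
    for site, counts in samples:
        for taxon, frac in relative_abundance(counts).items():
            sums[site][taxon] += frac
        n_samples[site] += 1
    return {
        site: {taxon: value / n_samples[site] for taxon, value in profile.items()}
        for site, profile in sums.items()
    }


def predict_site(centroids, counts):
    """Return the site whose mean profile is least dissimilar to the sample."""
    profile = relative_abundance(counts)
    return min(centroids, key=lambda site: bray_curtis(profile, centroids[site]))


# Hypothetical training data: (site label, raw read counts per taxon).
training = [
    ("site_A", {"taxon_1": 80, "taxon_2": 15, "taxon_3": 5}),
    ("site_A", {"taxon_1": 70, "taxon_2": 20, "taxon_3": 10}),
    ("site_B", {"taxon_1": 10, "taxon_2": 30, "taxon_3": 60}),
    ("site_B", {"taxon_1": 5, "taxon_2": 25, "taxon_3": 70}),
]

centroids = fit_centroids(training)
# A new sample with a different sequencing depth; only proportions matter.
predicted = predict_site(centroids, {"taxon_1": 150, "taxon_2": 40, "taxon_3": 10})
print(predicted)
```

Normalizing to relative abundances makes samples with different sequencing depths comparable, which is why the new sample with 200 reads can be matched against training samples with 100 reads each. A real application would need many more taxa and samples, held-out evaluation, and a model able to capture nonlinear patterns.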