Boodaghidizaji Miad, Jungles Thaisa, Chen Tingting, Zhang Bin, Yao Tianming, Landay Alan, Keshavarzian Ali, Hamaker Bruce, Ardekani Arezoo
School of Mechanical Engineering, Purdue University, 585 Purdue Mall, West Lafayette, IN, 47907, USA.
Department of Food Science, Whistler Center for Carbohydrate Research, Purdue University, West Lafayette, IN, 47907, USA.
BMC Microbiol. 2025 Jun 6;25(1):353. doi: 10.1186/s12866-025-04072-7.
The gut microbiota has been implicated in the pathogenesis of multiple gastrointestinal (GI) and systemic metabolic and inflammatory disorders, in which multiple studies have found disrupted gut microbiota composition and function (dysbiosis). Human microbiome data therefore holds significant potential as a source of information for diagnosing and characterizing diseases associated with dysbiotic microbiota communities, including their phenotypes, disease course, and therapeutic response. However, multiple attempts to leverage gut microbiota taxonomic data for diagnosis and disease characterization have failed because of significant inter-individual variability in microbiota communities and the overlap of disrupted microbiota communities among multiple diseases. One potential approach is to examine the microbiota community pattern, and its response to microbiota modifiers such as dietary fiber, in different disease states. This approach has become feasible with the advent of machine learning, which can uncover hidden patterns in human microbiome data and enable disease prediction. Accordingly, the aim of our study was to test the hypothesis that machine learning algorithms can distinguish stool microbiota patterns, and their responses to fiber, across diseases with previously reported overlapping dysbiotic microbiota profiles. Here, we applied machine learning algorithms to distinguish among subjects with Parkinson's disease (PD), Crohn's disease (CD), ulcerative colitis (UC), or human immunodeficiency virus (HIV) infection and healthy control (HC) subjects, in the presence and absence of fiber treatments. We demonstrated that machine learning algorithms can classify these diseases with accuracy as high as 95%. Furthermore, applying machine learning to microbiome data to distinguish UC from CD yielded a prediction accuracy of up to 90%.
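The classification task described in the abstract can be illustrated with a minimal sketch. The abstract does not name the algorithms or features used, so everything here is an assumption made for illustration: the data are synthetic relative-abundance vectors (not the study's data), the number of taxa and the group-specific enrichment are hypothetical, and a simple nearest-centroid classifier stands in for whatever models the authors actually applied. The accuracy it reports says nothing about the accuracies reported in the paper.

```python
import random
from collections import defaultdict

# Group labels taken from the abstract; everything else below is hypothetical.
LABELS = ["PD", "CD", "UC", "HIV", "HC"]
N_TAXA = 20  # assumed number of taxa, chosen only for illustration


def make_profile(rng, label_idx):
    """Return a synthetic relative-abundance vector (sums to 1)."""
    raw = [rng.random() for _ in range(N_TAXA)]
    # Hypothetical signal: each group is enriched in four group-specific taxa.
    for j in range(label_idx * 4, label_idx * 4 + 4):
        raw[j] += 1.0
    total = sum(raw)
    return [x / total for x in raw]


def make_dataset(n_per_class, seed):
    """Build a shuffled list of (profile, label) pairs."""
    rng = random.Random(seed)
    data = [
        (make_profile(rng, i), label)
        for i, label in enumerate(LABELS)
        for _ in range(n_per_class)
    ]
    rng.shuffle(data)
    return data


def fit_centroids(train):
    """Compute the mean profile of each class (nearest-centroid 'training')."""
    sums = defaultdict(lambda: [0.0] * N_TAXA)
    counts = defaultdict(int)
    for x, y in train:
        counts[y] += 1
        for j, v in enumerate(x):
            sums[y][j] += v
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))


def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(centroids, x) == y for x, y in samples)
    return hits / len(samples)


# Hold out 25% of the synthetic samples to estimate accuracy.
data = make_dataset(n_per_class=40, seed=0)
split = int(0.75 * len(data))
train, test = data[:split], data[split:]
centroids = fit_centroids(train)
print(f"held-out accuracy: {accuracy(centroids, test):.2f}")
```

Evaluating on held-out samples rather than on the training set is the relevant design choice here: with high inter-individual variability, as the abstract notes for real microbiome data, accuracy measured on the training samples would overstate how well a classifier separates the disease groups.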