Crohn's Disease Prediction Using Sequence Based Machine Learning Analysis of Human Microbiome.

Author information

Unal Metehan, Bostanci Erkan, Ozkul Ceren, Acici Koray, Asuroglu Tunc, Guzel Mehmet Serdar

Affiliations

Department of Computer Engineering, Ankara University, 06830 Ankara, Turkey.

Department of Pharmaceutical Microbiology, Faculty of Pharmacy, Hacettepe University, 06230 Ankara, Turkey.

Publication information

Diagnostics (Basel). 2023 Sep 1;13(17):2835. doi: 10.3390/diagnostics13172835.

DOI:10.3390/diagnostics13172835

PMID:37685376

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10486516/

Abstract

Human microbiota refers to the trillions of microorganisms that inhabit our bodies and have been found to have a substantial impact on human health and disease. Sampling the microbiota generates massive quantities of data that can be analyzed with Machine Learning algorithms. In this study, we employed several modern Machine Learning techniques to predict Inflammatory Bowel Disease from raw sequence data. The dataset was obtained from NCBI, preprocessed, converted into graph representations, and then converted into a structured form. Seven well-known Machine Learning algorithms were used: Random Forest, Support Vector Machines, Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Gaussian Naïve Bayes, Logistic Regression, and k-Nearest Neighbor. Grid Search was employed for hyperparameter optimization. The performance of the models was evaluated using metrics such as accuracy, precision, F-score, kappa, and area under the receiver operating characteristic curve. Additionally, McNemar's test was conducted to assess the statistical significance of the results. The data was constructed using k-mer lengths of 3, 4 and 5. The LightGBM model outperformed the other models, with 67.24%, 74.63% and 76.47% accuracy for k-mer lengths of 3, 4 and 5, respectively, and also achieved the best performance on each metric. The study showed promising results for predicting disease from raw sequence data. Finally, McNemar's test found statistically significant differences between the Machine Learning approaches.
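
The two less familiar steps named in the abstract, building features from k-mers and comparing classifiers with McNemar's test, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names `kmer_frequencies` and `mcnemar_exact` are hypothetical, and the sketch makes several assumptions the abstract does not state: features are normalized overlapping k-mer counts over the alphabet A, C, G, T, windows containing other symbols are skipped, and the exact binomial form of McNemar's test is used.

```python
from collections import Counter
from itertools import product
from math import comb


def kmer_frequencies(sequence, k):
    """Return normalized counts of every DNA k-mer (4**k features) in a sequence."""
    alphabet = "ACGT"
    sequence = sequence.upper()
    # Count overlapping windows of length k; skip windows with ambiguous bases (e.g. N).
    counts = Counter(
        sequence[i:i + k]
        for i in range(len(sequence) - k + 1)
        if set(sequence[i:i + k]) <= set(alphabet)
    )
    total = sum(counts.values())
    # Fixed lexicographic feature order so every sample yields the same columns.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on the discordant predictions of two models."""
    # b: model A correct, model B wrong; c: model A wrong, model B correct.
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    n = b + c
    if n == 0:
        return b, c, 1.0
    # Under the null hypothesis the discordant pairs follow Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)
```

With k = 3, 4 and 5 the feature vector has 64, 256 and 1024 entries respectively, which is why larger k gives a classifier more detail to work with at the cost of sparser counts. A small p-value from `mcnemar_exact` indicates that two classifiers disagree on the test samples more often in one direction than chance would explain.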

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a79/10486516/808ec513a99d/diagnostics-13-02835-g001.jpg
