Methodology for biomarker discovery with reproducibility in microbiome data using machine learning.

Affiliations

Division of Pharmacology, Utrecht Institute for Pharmaceutical Sciences, Faculty of Science, University of Utrecht, Utrecht, The Netherlands.

Department of Data Science, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands.

Publication information

BMC Bioinformatics. 2024 Jan 15;25(1):26. doi: 10.1186/s12859-024-05639-3.


DOI: 10.1186/s12859-024-05639-3
PMID: 38225565
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10789030/
Abstract

BACKGROUND: In recent years, human microbiome studies have received increasing attention because the field is considered a potential source of clinical applications. With advances in omics technologies and AI, research on discovering potential biomarkers in the human microbiome with machine learning tools has produced positive outcomes. Despite these promising results, the studies still have several problems, such as datasets with few samples, inconsistent results, and a lack of uniform processing and methodologies, which together with other factors lead to a lack of reproducibility in biomedical research. In this work, we propose a methodology that combines the DADA2 pipeline for 16S rRNA sequence processing with Recursive Ensemble Feature Selection (REFS) across multiple datasets to increase reproducibility and obtain robust and reliable results in biomedical research.

RESULTS: Three experiments were performed, analyzing microbiome data from patients/cases with Inflammatory Bowel Disease (IBD), Autism Spectrum Disorder (ASD), and Type 2 Diabetes (T2D). In each experiment, we found a biomarker signature in one dataset and applied it to two other datasets as further validation. The effectiveness of the proposed methodology was compared with other feature selection methods, such as K-Best with F-score, and with random selection as a baseline. The Area Under the Curve (AUC) was used as a measure of diagnostic accuracy and as a metric for comparing the results of the proposed methodology with those of the other feature selection methods. Additionally, we used the Matthews Correlation Coefficient (MCC) to evaluate the performance of the methodology and to compare it with the other feature selection methods.

CONCLUSIONS: We developed a methodology for reproducible biomarker discovery in 16S rRNA microbiome sequence analysis, addressing the issues related to data dimensionality, inconsistent results, and validation across independent datasets. The findings from the three experiments, across 9 different datasets, show that the proposed methodology achieved higher accuracy than the other feature selection methods. This methodology is a first approach to increasing reproducibility and providing robust and reliable results.
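The comparison methods and metrics named in the abstract (K-Best with F-score, random selection as a baseline, AUC, and MCC) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' REFS implementation or their pipeline; the function names (`f_score`, `select_k_best`, `random_selection`, `auc`, `mcc`) and the toy data are assumptions made up for illustration, and a real analysis would more likely rely on an established library.

```python
import math
import random
from typing import List, Sequence


def f_score(values: Sequence[float], labels: Sequence[int]) -> float:
    """One-way ANOVA F statistic of a single feature for two classes (0/1)."""
    groups = [[v for v, l in zip(values, labels) if l == c] for c in (0, 1)]
    n = len(values)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    if within == 0.0:
        return math.inf if between > 0.0 else 0.0
    # Two groups: 1 degree of freedom between, n - 2 within.
    return between / (within / (n - 2))


def select_k_best(X: Sequence[Sequence[float]], y: Sequence[int], k: int) -> List[int]:
    """Indices of the k features with the highest F-score."""
    n_features = len(X[0])
    scores = [f_score([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def random_selection(n_features: int, k: int, seed: int = 0) -> List[int]:
    """Baseline: k feature indices drawn uniformly at random (seeded)."""
    return random.Random(seed).sample(range(n_features), k)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def mcc(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews Correlation Coefficient from the binary confusion matrix."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy data (invented): 6 samples x 3 features; only feature 0 separates the classes.
    X = [[5.0, 1.0, 2.0], [6.0, 3.0, 2.1], [5.5, 2.0, 1.9],
         [1.0, 2.0, 2.0], [1.5, 1.0, 2.1], [0.5, 3.0, 1.9]]
    y = [1, 1, 1, 0, 0, 0]
    print("K-Best (k=1):", select_k_best(X, y, 1))
    print("Random (k=1):", random_selection(3, 1, seed=42))
    scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]  # hypothetical classifier outputs
    print("AUC:", auc(y, scores))
    print("MCC:", mcc(y, [1 if s >= 0.5 else 0 for s in scores]))
```

AUC summarizes ranking quality over all thresholds, whereas MCC scores a single thresholded prediction using all four cells of the confusion matrix, which is one reason the two are often reported side by side.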

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3b7/10789030/0ed2aa466023/12859_2024_5639_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3b7/10789030/b1024e74c250/12859_2024_5639_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3b7/10789030/be13cc2126c5/12859_2024_5639_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3b7/10789030/f5632baea25f/12859_2024_5639_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3b7/10789030/10a501f812eb/12859_2024_5639_Fig4_HTML.jpg
