
Host phenotype classification from human microbiome data is mainly driven by the presence of microbial taxa.

Affiliations

Department of Agricultural Sciences, University of Naples Federico II, Portici, Italy.

Task Force on Microbiome Studies, University of Naples Federico II, Naples, Italy.

Publication information

PLoS Comput Biol. 2022 Apr 21;18(4):e1010066. doi: 10.1371/journal.pcbi.1010066. eCollection 2022 Apr.

DOI:10.1371/journal.pcbi.1010066
PMID:35446845
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9064115/
Abstract

Machine learning-based classification approaches are widely used to predict host phenotypes from microbiome data. Classifiers are typically employed by considering operational taxonomic units or relative abundance profiles as input features. Such types of data are intrinsically sparse, which opens the opportunity to make predictions from the presence/absence rather than the relative abundance of microbial taxa. This also poses the question whether it is the presence rather than the abundance of particular taxa to be relevant for discrimination purposes, an aspect that has been so far overlooked in the literature. In this paper, we aim at filling this gap by performing a meta-analysis on 4,128 publicly available metagenomes associated with multiple case-control studies. At species-level taxonomic resolution, we show that it is the presence rather than the relative abundance of specific microbial taxa to be important when building classification models. Such findings are robust to the choice of the classifier and confirmed by statistical tests applied to identifying differentially abundant/present taxa. Results are further confirmed at coarser taxonomic resolutions and validated on 4,026 additional 16S rRNA samples coming from 30 public case-control studies.
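The presence/absence representation described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the detection threshold of 0.0, and the toy profiles are assumptions made only for illustration. It shows how a sparse relative-abundance table (rows are samples, columns are taxa) reduces to a binary table that a classifier could take as input instead.

```python
from typing import List


def to_presence_absence(profiles: List[List[float]],
                        threshold: float = 0.0) -> List[List[int]]:
    """Binarize relative-abundance profiles.

    A taxon counts as present (1) when its abundance is strictly greater
    than `threshold`, otherwise absent (0). The default threshold of 0.0
    is an assumption, not a value taken from the paper.
    """
    return [[1 if value > threshold else 0 for value in sample]
            for sample in profiles]


def sparsity(profiles: List[List[float]]) -> float:
    """Return the fraction of entries that are exactly zero."""
    total = sum(len(sample) for sample in profiles)
    zeros = sum(1 for sample in profiles for value in sample if value == 0)
    return zeros / total if total else 0.0


# Hypothetical toy data: 3 samples x 4 taxa, each row sums to 1.
abundance = [
    [0.60, 0.00, 0.40, 0.00],
    [0.00, 0.25, 0.75, 0.00],
    [0.10, 0.00, 0.00, 0.90],
]

presence = to_presence_absence(abundance)
print(presence)
print(sparsity(abundance))
```

Either table could then be passed to the same classifier to compare predictive performance, which is the kind of comparison the abstract reports.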


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/956a/9064115/349ca0777131/pcbi.1010066.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/956a/9064115/62954097c163/pcbi.1010066.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/956a/9064115/8204e81d6fe0/pcbi.1010066.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/956a/9064115/577f8162741c/pcbi.1010066.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/956a/9064115/d3661889e350/pcbi.1010066.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/956a/9064115/55e99a88b847/pcbi.1010066.g006.jpg
