A comprehensive evaluation of multicategory classification methods for microbiomic data.

Affiliation

Center for Health Informatics and Bioinformatics, New York University Langone Medical Center, 227 East 30th Street, New York, NY, USA.

Publication information

Microbiome. 2013 Apr 5;1(1):11. doi: 10.1186/2049-2618-1-11.

DOI:10.1186/2049-2618-1-11
PMID:24456583
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3960509/
Abstract

BACKGROUND

Recent advances in next-generation DNA sequencing enable rapid high-throughput quantitation of microbial community composition in human samples, opening up a new field of microbiomics. One of the promises of this field is linking abundances of microbial taxa to phenotypic and physiological states, which can inform development of new diagnostic, personalized medicine, and forensic modalities. Prior research has demonstrated the feasibility of applying machine learning methods to perform body site and subject classification with microbiomic data. However, it is currently unknown which classifiers perform best among the many available alternatives for classification with microbiomic data.

RESULTS

In this work, we performed a systematic comparison of 18 major classification methods, 5 feature selection methods, and 2 accuracy metrics using 8 datasets spanning 1,802 human samples and various classification tasks: body site and subject classification and diagnosis.

CONCLUSIONS

We found that random forests, support vector machines, kernel ridge regression, and Bayesian logistic regression with Laplace priors are the most effective machine learning techniques for performing accurate classification from these microbiomic data.
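The comparison described above rests on estimating each classifier's accuracy on held-out samples. The abstract does not state the evaluation protocol, the two accuracy metrics, or any implementation details, so the following is only a minimal sketch of one common approach: stratified k-fold cross-validation scored by the proportion of correct classifications. Everything in it is an assumption made for illustration: the function names, the synthetic three-site data, and the two stand-in classifiers (nearest centroid and 1-nearest-neighbor), which are not the methods the authors found most effective.

```python
import random
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two abundance profiles."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        idx = by_class[label]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal each class round-robin across folds
    return folds


def nearest_centroid(train_X, train_y):
    """Stand-in classifier: predict the class whose mean profile is closest."""
    sums, counts = {}, defaultdict(int)
    for x, label in zip(train_X, train_y):
        prev = sums.get(label, [0.0] * len(x))
        sums[label] = [s + v for s, v in zip(prev, x)]
        counts[label] += 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
    return lambda x: min(centroids, key=lambda c: sq_dist(x, centroids[c]))


def one_nearest_neighbor(train_X, train_y):
    """Stand-in classifier: predict the label of the closest training sample."""
    def predict(x):
        j = min(range(len(train_X)), key=lambda i: sq_dist(x, train_X[i]))
        return train_y[j]
    return predict


def cv_accuracy(fit, X, y, k=5, seed=0):
    """Proportion of samples classified correctly while held out."""
    correct = 0
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct += sum(predict(X[i]) == y[i] for i in test_idx)
    return correct / len(y)


def make_toy_data(n_per_class=30, n_taxa=6, seed=1):
    """Synthetic relative abundances for three hypothetical body sites."""
    rng = random.Random(seed)
    X, y = [], []
    for c, site in enumerate(["gut", "skin", "oral"]):
        for _ in range(n_per_class):
            # One taxon is enriched per site; counts are then normalized.
            raw = [rng.random() + (3.0 if t == c else 0.0) for t in range(n_taxa)]
            total = sum(raw)
            X.append([v / total for v in raw])
            y.append(site)
    return X, y


X, y = make_toy_data()
classifiers = [("nearest_centroid", nearest_centroid),
               ("1-NN", one_nearest_neighbor)]
results = {name: cv_accuracy(fit, X, y) for name, fit in classifiers}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

Because each classifier is passed in as a plain `fit(train_X, train_y)` function that returns a `predict` function, other methods could be added to the `classifiers` list without changing the evaluation loop. A study of this kind would typically also place feature selection and hyperparameter tuning inside each training fold so that the held-out samples never influence model choice.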

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1f8c/3960509/0d48470a3807/2049-2618-1-11-1.jpg
