MetaPheno: A critical evaluation of deep learning and machine learning in metagenome-based disease prediction.

Affiliation

Department of Computer Science, University of California at Los Angeles, Los Angeles, CA 90095, USA.

Publication information

Methods. 2019 Aug 15;166:74-82. doi: 10.1016/j.ymeth.2019.03.003. Epub 2019 Mar 16.

DOI: 10.1016/j.ymeth.2019.03.003

PMID: 30885720

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC6708502/

Abstract

The human microbiome plays a number of critical roles, impacting almost every aspect of human health and well-being. Conditions in the microbiome have been linked to a number of significant diseases. Additionally, revolutions in sequencing technology have led to a rapid increase in publicly available sequencing data. Consequently, there have been growing efforts to predict disease status from metagenomic sequencing data, with a proliferation of new approaches in the last few years. Some of these efforts have explored utilizing a powerful form of machine learning called deep learning, which has been applied successfully in several biological domains. Here, we review some of these methods and the algorithms that they are based on, with a particular focus on deep learning methods. We also perform a deeper analysis of Type 2 Diabetes and obesity datasets that have eluded improved results, using a variety of machine learning and feature extraction methods. We conclude by offering perspectives on study design considerations that may impact results and future directions the field can take to improve results and offer more valuable conclusions. The scripts and extracted features for the analyses conducted in this paper are available via GitHub: https://github.com/nlapier2/metapheno.
