MegaD: Deep Learning for Rapid and Accurate Disease Status Prediction of Metagenomic Samples.

Authors

Mreyoud Yassin, Song Myoungkyu, Lim Jihun, Ahn Tae-Hyuk

Affiliations

Program in Bioinformatics and Computational Biology, Saint Louis University, Saint Louis, MO 63104, USA.

Department of Computer Science, University of Nebraska Omaha, Omaha, NE 68182, USA.

Publication information

Life (Basel). 2022 Apr 30;12(5):669. doi: 10.3390/life12050669.

DOI:10.3390/life12050669
PMID:35629336
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9143510/
Abstract

The diversity within the microbial communities that drive biogeochemical processes influences many different phenotypes. Analyses of these communities and their diversity by numerous microbiome projects have revealed the important role of metagenomics in understanding the complex relationship between microbes and their environments. This relationship can be understood in the context of the microbiome composition of specific known environments, and these compositions can then be used as templates for predicting the status of similar environments. Machine learning is a key component of this predictive task, and several analysis tools that apply machine learning methods to metagenomic analysis have already been published. Despite these previously proposed models, the performance of deep neural networks remains under-researched. Given the nature of metagenomic data, deep neural networks could substantially improve prediction accuracy in metagenomic analysis applications. To meet this demand, we present a deep learning-based tool that uses a deep neural network for phenotypic prediction of unknown metagenomic samples. (1) First, our tool takes as input taxonomic profiles from 16S or WGS sequencing data. (2) Second, given the samples, our tool builds a model based on a deep neural network by computing multi-level classification. (3) Lastly, given the model, our tool classifies an unknown sample from its unlabeled taxonomic profile. In benchmark experiments, we found that an analysis method using a deep neural network, such as our tool, can show promising results in increasing prediction accuracy on several samples compared to other machine learning models.
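
The three steps named in the abstract (taxonomic profiles as input, a neural network model built from labeled samples, classification of an unknown sample) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from MegaD: the names `to_relative_abundance` and `TinyNet`, the single small hidden layer, the two-class labels, and the toy count data. The abstract's "multi-level classification" is not modeled.

```python
import math
import random


def to_relative_abundance(counts):
    """Step 1: turn raw per-taxon counts into a profile that sums to 1."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


class TinyNet:
    """One tanh hidden layer and a sigmoid output (illustrative stand-in only)."""

    def __init__(self, n_in, n_hidden=4, seed=0):
        rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        z = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2
        return h, 1.0 / (1.0 + math.exp(-z))

    def fit(self, X, y, epochs=2000, lr=0.5):
        """Step 2: per-sample gradient descent on binary cross-entropy."""
        for _ in range(epochs):
            for x, t in zip(X, y):
                h, p = self.forward(x)
                err = p - t  # loss gradient w.r.t. the output pre-activation
                for j, hj in enumerate(h):
                    # Backpropagate through tanh using w2[j] before it is updated.
                    grad_h = err * self.w2[j] * (1.0 - hj * hj)
                    self.w2[j] -= lr * err * hj
                    self.b1[j] -= lr * grad_h
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * grad_h * xi
                self.b2 -= lr * err

    def predict(self, x):
        """Step 3: label an unseen profile."""
        return "disease" if self.forward(x)[1] >= 0.5 else "healthy"


# Toy read counts over four hypothetical taxa; label 1 = disease, 0 = healthy.
raw_counts = [
    ([70, 10, 15, 5], 1), ([5, 15, 10, 70], 0),
    ([60, 20, 10, 10], 1), ([10, 10, 20, 60], 0),
    ([80, 5, 10, 5], 1), ([5, 10, 5, 80], 0),
]
X = [to_relative_abundance(counts) for counts, _ in raw_counts]
y = [label for _, label in raw_counts]

model = TinyNet(n_in=4)
model.fit(X, y)

unknown = to_relative_abundance([65, 15, 10, 10])
print(model.predict(unknown))
```

In practice the profiles would come from a taxonomic profiler rather than hand-written counts, and a deep learning framework would replace the hand-rolled network; the sketch only shows how the three stages connect.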
