
Phy-PMRFI: Phylogeny-Aware Prediction of Metagenomic Functions Using Random Forest Feature Importance.

Publication information

IEEE Trans Nanobioscience. 2019 Jul;18(3):273-282. doi: 10.1109/TNB.2019.2912824. Epub 2019 Apr 24.

DOI: 10.1109/TNB.2019.2912824
PMID: 31021803

Abstract

High-throughput sequencing techniques have accelerated functional metagenomics studies by generating large volumes of omics data. Integrating these data with computational approaches is potentially useful for predicting metagenomic functions. Machine learning (ML) models can be trained on microbial features and then used to classify microbial data into different functional classes. For example, ML analyses of human microbiome data have been linked to the prediction of important biological states. When analysing omics data, it is important to integrate the abundance counts of taxonomic features with their biological relationships. These relationships can potentially be uncovered from the phylogenetic tree of microbial taxa. In this paper, we propose a novel integrative framework, Phy-PMRFI. The framework is driven by phylogeny-based modeling of omics data and predicts metagenomic functions using important features selected by a random forest importance (RFI) strategy. It integrates the underlying phylogenetic tree information with abundance measures of microbial species (features) by creating a novel phylogeny- and abundance-aware matrix structure (PAAM). Phy-PMRFI then ranks the microbial features using an RFI measure, and the ranked features are used as input for microbiome classification. The resulting feature set enhances the performance of state-of-the-art methods such as support vector machines. The proposed framework also outperforms the state-of-the-art pipelines of phylogenetic isometric log-ratio transform (PhILR) and MetaPhyl. On human throat microbiome data, Phy-PMRFI obtains a prediction accuracy of 90%, compared with 53% for PhILR and 71% for MetaPhyl.
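
The two steps named in the abstract, building a phylogeny- and abundance-aware matrix and ranking its columns by random forest importance, can be illustrated with a small sketch. The abstract does not say how PAAM is constructed, so the scheme below (adding each species count to all of its ancestors in the tree) is an assumption, not the authors' method. The toy tree, the synthetic samples, and every function name are hypothetical. The forest is reduced to bootstrapped depth-1 trees scored by mean Gini decrease so that the example stays within the Python standard library; the downstream classifier (for example a support vector machine) is left out.

```python
import random
from collections import Counter

# Hypothetical toy phylogeny, stored as child -> parent (None marks the root).
PARENT = {"root": None, "cladeA": "root", "cladeB": "root",
          "sp1": "cladeA", "sp2": "cladeA", "sp3": "cladeB", "sp4": "cladeB"}

def phylo_abundance_row(leaf_counts):
    """Propagate each species count to all of its ancestors (assumed scheme)."""
    row = {node: 0.0 for node in PARENT}
    for leaf, count in leaf_counts.items():
        node = leaf
        while node is not None:
            row[node] += count
            node = PARENT[node]
    return row

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def best_stump_gain(X, y, feature):
    """Largest Gini decrease achievable with one threshold on one feature."""
    values = sorted({row[feature] for row in X})
    best = 0.0
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2
        left = [lab for row, lab in zip(X, y) if row[feature] <= cut]
        right = [lab for row, lab in zip(X, y) if row[feature] > cut]
        children = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        best = max(best, gini(y) - children)
    return best

def forest_importance(X, y, n_trees=300, n_sub=3, seed=0):
    """Mean impurity decrease per feature over bootstrapped depth-1 trees."""
    rng = random.Random(seed)
    features = list(X[0])
    totals = dict.fromkeys(features, 0.0)
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]   # bootstrap sample
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        candidates = rng.sample(features, n_sub)   # random feature subset
        gains = {f: best_stump_gain(Xb, yb, f) for f in candidates}
        winner = max(gains, key=gains.get)         # feature this tree splits on
        totals[winner] += gains[winner]
    return {f: total / n_trees for f, total in totals.items()}

def make_toy_data(n=40, seed=1):
    """Synthetic samples: the class fixes the cladeA total, not any one species."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = "healthy" if i % 2 == 0 else "disease"
        clade_a_total = 10 if label == "healthy" else 4
        sp1 = rng.randint(0, clade_a_total)
        leaves = {"sp1": sp1, "sp2": clade_a_total - sp1,
                  "sp3": rng.randint(0, 10), "sp4": rng.randint(0, 10)}
        X.append(phylo_abundance_row(leaves))
        y.append(label)
    return X, y

X, y = make_toy_data()
importance = forest_importance(X, y)
ranking = sorted(importance, key=importance.get, reverse=True)
print(ranking)  # features ordered from most to least important
```

In this synthetic setup the class signal lives in the combined abundance of a clade rather than in any single species, which is the kind of relationship a purely species-level feature table would hide. A practical pipeline would more likely use a full random forest implementation and pass the top-ranked columns to the chosen classifier.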


Similar articles

1
Phy-PMRFI: Phylogeny-Aware Prediction of Metagenomic Functions Using Random Forest Feature Importance.
IEEE Trans Nanobioscience. 2019 Jul;18(3):273-282. doi: 10.1109/TNB.2019.2912824. Epub 2019 Apr 24.
2
Developing a New Phylogeny-Driven Random Forest Model for Functional Metagenomics.
IEEE Trans Nanobioscience. 2023 Oct;22(4):763-770. doi: 10.1109/TNB.2023.3283462. Epub 2023 Oct 3.
3
Explaining diversity in metagenomic datasets by phylogenetic-based feature weighting.
PLoS Comput Biol. 2015 Mar 27;11(3):e1004186. doi: 10.1371/journal.pcbi.1004186. eCollection 2015 Mar.
4
Phylogeny-based classification of microbial communities.
Bioinformatics. 2014 Feb 15;30(4):449-56. doi: 10.1093/bioinformatics/btt700. Epub 2013 Dec 24.
5
Massive metagenomic data analysis using abundance-based machine learning.
Biol Direct. 2019 Aug 1;14(1):12. doi: 10.1186/s13062-019-0242-0.
6
Exploiting topic modeling to boost metagenomic reads binning.
BMC Bioinformatics. 2015;16 Suppl 5(Suppl 5):S2. doi: 10.1186/1471-2105-16-S5-S2. Epub 2015 Mar 18.
7
Piphillin: Improved Prediction of Metagenomic Content by Direct Inference from Human Microbiomes.
PLoS One. 2016 Nov 7;11(11):e0166104. doi: 10.1371/journal.pone.0166104. eCollection 2016.
8
Re-purposing software for functional characterization of the microbiome.
Microbiome. 2021 Jan 9;9(1):4. doi: 10.1186/s40168-020-00971-1.
9
MetaNN: accurate classification of host phenotypes from metagenomic data using neural networks.
BMC Bioinformatics. 2019 Jun 20;20(Suppl 12):314. doi: 10.1186/s12859-019-2833-2.
10
PopPhy-CNN: A Phylogenetic Tree Embedded Architecture for Convolutional Neural Networks to Predict Host Phenotype From Metagenomic Data.
IEEE J Biomed Health Inform. 2020 Oct;24(10):2993-3001. doi: 10.1109/JBHI.2020.2993761. Epub 2020 May 11.

Cited by

1
Artificial Intelligence: A Promising Tool in Exploring the Phytomicrobiome in Managing Disease and Promoting Plant Health.
Plants (Basel). 2023 Apr 30;12(9):1852. doi: 10.3390/plants12091852.
2
Application of Deep Learning in Plant-Microbiota Association Analysis.
Front Genet. 2021 Oct 8;12:697090. doi: 10.3389/fgene.2021.697090. eCollection 2021.
3
Applications of Machine Learning in Human Microbiome Studies: A Review on Feature Selection, Biomarker Identification, Disease Prediction and Treatment.
Front Microbiol. 2021 Feb 19;12:634511. doi: 10.3389/fmicb.2021.634511. eCollection 2021.
4
HMMPred: Accurate Prediction of DNA-Binding Proteins Based on HMM Profiles and XGBoost Feature Selection.
Comput Math Methods Med. 2020 Mar 28;2020:1384749. doi: 10.1155/2020/1384749. eCollection 2020.