
Predicting the Pathway Involvement of Metabolites Based on Combined Metabolite and Pathway Features

Authors

Huckvale Erik D, Moseley Hunter N B

Affiliations

Markey Cancer Center, University of Kentucky, Lexington, KY 40506, USA.

Superfund Research Center, University of Kentucky, Lexington, KY 40506, USA.

Publication information

bioRxiv. 2024 Apr 2:2024.04.01.587582. doi: 10.1101/2024.04.01.587582.

DOI: 10.1101/2024.04.01.587582
PMID: 38617261
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11014601/
Abstract

A major limitation of most metabolomics datasets is the sparsity of pathway annotations for detected metabolites. It is common for fewer than half of the identified metabolites in these datasets to have known metabolic pathway involvement. To address this limitation, machine learning models have been developed to predict the association of a metabolite with a "pathway category", as defined by a metabolic knowledgebase such as the Kyoto Encyclopedia of Genes and Genomes (KEGG). Most of these models are implemented as a single binary classifier specific to a single pathway category, so generating predictions for multiple pathway categories requires a set of binary classifiers. This one-classifier-per-category approach both multiplies the computational resources needed for training and dilutes the positive entries in the gold standard datasets used for training. To address the limitations of training separate classifiers, we propose a generalization of the metabolic pathway prediction problem that uses a single binary classifier. It accepts features representing a metabolite together with features representing a generic pathway category, and then predicts whether the given metabolite is involved in the corresponding pathway category. We demonstrate that this metabolite-pathway features-pair approach is not only competitive with the combined performance of separately trained binary classifiers, but also outperforms the previous benchmark models.
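
The features-pair formulation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `build_pair_dataset`, the toy feature dimensions, and the random feature values are all assumptions made for illustration, since the abstract does not specify how metabolite or pathway features are computed. The sketch only shows how every (metabolite, pathway category) pair can become one row of a single training table, so that positive entries from all categories are pooled rather than split across separate classifiers.

```python
import numpy as np


def build_pair_dataset(metabolite_features, pathway_features, membership):
    """Cross-join metabolites and pathway categories into one training table.

    Hypothetical helper, for illustration only.
    metabolite_features: (n_metabolites, d_m) array
    pathway_features:    (n_pathways, d_p) array
    membership:          (n_metabolites, n_pathways) 0/1 array
    Returns X with shape (n_metabolites * n_pathways, d_m + d_p)
    and y with shape (n_metabolites * n_pathways,).
    """
    metabolite_features = np.asarray(metabolite_features, dtype=float)
    pathway_features = np.asarray(pathway_features, dtype=float)
    membership = np.asarray(membership)
    n_m, n_p = membership.shape
    if metabolite_features.shape[0] != n_m or pathway_features.shape[0] != n_p:
        raise ValueError("feature arrays must match the membership matrix shape")

    # Row i * n_p + j pairs metabolite i with pathway category j.
    left = np.repeat(metabolite_features, n_p, axis=0)   # m0, m0, ..., m1, m1, ...
    right = np.tile(pathway_features, (n_m, 1))          # p0, p1, ..., p0, p1, ...
    X = np.hstack([left, right])
    y = membership.reshape(-1).astype(int)                # same row-major order
    return X, y


# Toy data (assumed): 4 metabolites with 3 features, 3 categories with 2 features.
rng = np.random.default_rng(0)
metabolites = rng.random((4, 3))
pathways = rng.random((3, 2))
membership = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
])

X, y = build_pair_dataset(metabolites, pathways, membership)

# One classifier per category would see only these positives each:
positives_per_category = membership.sum(axis=0)
# The single pair-based table pools them all:
pooled_positives = int(y.sum())
print(X.shape, positives_per_category.tolist(), pooled_positives)
```

Any binary classifier could then be trained on `X` and `y`. Under this framing, the model count stays at one regardless of how many pathway categories exist, which is the resource and data-dilution argument the abstract makes.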


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/573f/11014601/16c1d5cc01e0/nihpp-2024.04.01.587582v1-f0001.jpg

