Predicting the Pathway Involvement of Metabolites in Both Pathway Categories and Individual Pathways.

Authors

Huckvale Erik D, Moseley Hunter N B

Affiliations

Markey Cancer Center, University of Kentucky, Lexington, KY, USA.

Superfund Research Center, University of Kentucky, Lexington, KY, USA.

Publication information

bioRxiv. 2024 Aug 9:2024.08.07.607025. doi: 10.1101/2024.08.07.607025.

DOI:10.1101/2024.08.07.607025
PMID:39149299
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11326255/
Abstract

Metabolism is the network of chemical reactions that sustain cellular life. Parts of this metabolic network are defined as metabolic pathways containing specific biochemical reactions. Products and reactants of these reactions are called metabolites, which are associated with certain human-defined metabolic pathways. Metabolic knowledgebases, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG), contain metabolites, reactions, and pathway annotations; however, such resources are incomplete due to current limits of metabolic knowledge. To fill in missing metabolite pathway annotations, past machine learning models showed some success at predicting KEGG Level 2 pathway category involvement of metabolites based on their chemical structure. Here, we present the first machine learning model to predict metabolite association with more granular KEGG Level 3 metabolic pathways. We used a feature and dataset engineering approach to generate over one million metabolite-pathway entries in the dataset used to train a single binary classifier. This approach produced a mean Matthews correlation coefficient (MCC) of 0.806 ± 0.017 SD across 100 cross-validation iterations. The 172 Level 3 pathways were predicted with an overall MCC of 0.726. Moreover, metabolite association with the 12 Level 2 pathway categories was predicted with an overall MCC of 0.891, representing significant transfer learning from the Level 3 pathway entries. These are the best metabolite-pathway prediction results published so far in the field.
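
As a rough illustration of the two ideas the abstract relies on, the sketch below pairs every metabolite with every pathway to form binary-labelled entries for a single classifier, and computes the Matthews correlation coefficient from a confusion matrix. It is a minimal sketch, not the authors' implementation: the abstract does not describe the features or how entries are built, so the concatenated feature vectors, the identifiers `M1`, `P1`, and so on, and the function names `build_pair_entries` and `mcc` are all assumptions made for illustration.

```python
import math
from itertools import product


def build_pair_entries(metabolite_features, pathway_features, known_pairs):
    """Cross-join metabolites and pathways into binary-labelled entries.

    Assumption: each entry is the metabolite vector concatenated with the
    pathway vector; the label is 1 when the pair is a known annotation.
    """
    entries = []
    for (m_id, m_vec), (p_id, p_vec) in product(
        metabolite_features.items(), pathway_features.items()
    ):
        label = 1 if (m_id, p_id) in known_pairs else 0
        entries.append((m_id, p_id, list(m_vec) + list(p_vec), label))
    return entries


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy data: 2 metabolites x 3 pathways gives 6 entries.
metabolites = {"M1": [1, 0], "M2": [0, 1]}
pathways = {"P1": [1], "P2": [0], "P3": [1]}
annotations = {("M1", "P1"), ("M2", "P3")}

entries = build_pair_entries(metabolites, pathways, annotations)
y_true = [label for _, _, _, label in entries]
y_pred = [1, 0, 0, 0, 0, 0]  # stand-in for a classifier's predictions
print(len(entries), round(mcc(y_true, y_pred), 3))
```

Because most metabolite-pathway pairs are negatives, a cross-joined dataset like this is heavily imbalanced, which is one reason MCC is a more informative summary than plain accuracy here.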

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d56/11326255/f3bca32900bb/nihpp-2024.08.07.607025v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d56/11326255/a989626b9875/nihpp-2024.08.07.607025v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d56/11326255/e050b4836718/nihpp-2024.08.07.607025v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d56/11326255/5dd56a39063d/nihpp-2024.08.07.607025v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d56/11326255/bc70b9a3c20a/nihpp-2024.08.07.607025v1-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d56/11326255/44b2164c2d4f/nihpp-2024.08.07.607025v1-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d56/11326255/2ab9ee4d211e/nihpp-2024.08.07.607025v1-f0007.jpg
