Machine learning methods for metabolic pathway prediction.

Affiliation

Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA.

Publication information

BMC Bioinformatics. 2010 Jan 8;11:15. doi: 10.1186/1471-2105-11-15.

DOI: 10.1186/1471-2105-11-15
PMID: 20064214
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3146072/

Abstract

BACKGROUND

A key challenge in systems biology is the reconstruction of an organism's metabolic network from its genome sequence. One strategy for addressing this problem is to predict which metabolic pathways, from a reference database of known pathways, are present in the organism, based on the annotated genome of the organism.

RESULTS

To quantitatively validate methods for pathway prediction, we developed a large "gold standard" dataset of 5,610 pathway instances known to be present or absent in curated metabolic pathway databases for six organisms. We defined a collection of 123 pathway features, whose information content we evaluated with respect to the gold standard. Feature data were used as input to an extensive collection of machine learning (ML) methods, including naïve Bayes, decision trees, and logistic regression, together with feature selection and ensemble methods. We compared the ML methods to the previous PathoLogic algorithm for pathway prediction using the gold standard dataset. We found that ML-based prediction methods can match the performance of the PathoLogic algorithm. PathoLogic achieved an accuracy of 91% and an F-measure of 0.786. The ML-based prediction methods achieved accuracy as high as 91.2% and F-measure as high as 0.787. Unlike PathoLogic, the ML-based methods output a probability for each predicted pathway, which provides more information to the user and facilitates filtering of predicted pathways.
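
The two ideas in this paragraph, scoring predictions against a gold standard and filtering predicted pathways by probability, can be illustrated with a minimal Python sketch. It is not the authors' code: the `Prediction` record, the pathway names, the probabilities, and the thresholds are all hypothetical, and the F-measure is assumed here to be the balanced F1 score, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    pathway: str        # hypothetical pathway identifier
    probability: float  # probability output by some ML model (made up here)
    present: bool       # gold-standard label: is the pathway really present?


# Toy data for illustration only; not taken from the paper's dataset.
TOY_PREDICTIONS = [
    Prediction("pathway-A", 0.95, True),
    Prediction("pathway-B", 0.80, True),
    Prediction("pathway-C", 0.60, False),
    Prediction("pathway-D", 0.40, True),
    Prediction("pathway-E", 0.10, False),
    Prediction("pathway-F", 0.05, False),
]


def confusion_counts(predictions, threshold=0.5):
    """Return (tp, fp, tn, fn) when pathways with probability >= threshold are called present."""
    tp = fp = tn = fn = 0
    for p in predictions:
        called_present = p.probability >= threshold
        if called_present and p.present:
            tp += 1
        elif called_present and not p.present:
            fp += 1
        elif not called_present and p.present:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of pathway instances classified correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def f_measure(tp, fp, fn):
    """Balanced F1 score: harmonic mean of precision and recall (an assumption, see lead-in)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def filter_predictions(predictions, threshold):
    """Keep only the pathways whose predicted probability meets the threshold."""
    return [p.pathway for p in predictions if p.probability >= threshold]


if __name__ == "__main__":
    for threshold in (0.5, 0.7):
        tp, fp, tn, fn = confusion_counts(TOY_PREDICTIONS, threshold)
        print(threshold, accuracy(tp, fp, tn, fn), f_measure(tp, fp, fn))
    print(filter_predictions(TOY_PREDICTIONS, 0.7))
```

Raising the threshold trades recall for precision, which is the kind of filtering a per-pathway probability makes possible and a plain present/absent call does not.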

CONCLUSIONS

ML methods for pathway prediction perform as well as existing methods, and have qualitative advantages in terms of extensibility, tunability, and explainability. More advanced prediction methods and/or more sophisticated input features may improve the performance of ML methods. However, pathway prediction performance appears to be limited largely by the ability to correctly match enzymes to the reactions they catalyze based on genome annotations.
