Exploration and Evaluation of Machine Learning-Based Models for Predicting Enzymatic Reactions.

Affiliations

Department of Chemical Science and Engineering, Graduate School of Engineering, Kobe University, 1-1 Rokkodai-cho, Nada, Kobe, Hyogo 657-8501, Japan.

Graduate School of Medicine, Kyoto University, 54 Kawahara-cho, Shogoin Sakyo-ku, Kyoto 606-8507, Japan.

Publication information

J Chem Inf Model. 2020 Mar 23;60(3):1833-1843. doi: 10.1021/acs.jcim.9b00877. Epub 2020 Feb 27.

DOI:10.1021/acs.jcim.9b00877

PMID:32053362

Abstract

Unannotated gene sequences in databases are increasing due to sequencing advances. Therefore, computational methods to predict functions of unannotated genes are needed. Moreover, novel enzyme discovery for metabolic engineering applications further encourages annotation of sequences. Here, enzyme functions are predicted using two general approaches, each including several machine learning algorithms. First, Enzyme-models (E-models) predict Enzyme Commission (EC) numbers from amino acid sequence information. Second, Substrate-Enzyme models (SE-models) are built to predict substrates of enzymatic reactions together with EC numbers, and Substrate-Enzyme-Product models (SEP-models) are built to predict substrates, products, and EC numbers. While accuracy of E-models is not optimal, SE-models and SEP-models predict EC numbers and reactions with high accuracy using all tested machine learning-based methods. For example, a single Random Forests-based SEP-model predicts EC first digits with an Average AUC score of over 0.94. Various metrics indicate that the current strategy of combining sequence and chemical structure information is effective at improving enzyme reaction prediction.
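The abstract does not state which descriptors or software the authors used, so the following is only a minimal sketch of the general idea: a sequence descriptor is concatenated with substrate and product structure descriptors to form an SEP-style input, and multi-class predictions are scored with an averaged one-vs-rest AUC. Amino acid composition, the bit-vector fingerprints, the macro-averaging of AUC, and all function names (`aa_composition`, `sep_features`, `binary_auc`, `average_auc`) are illustrative assumptions, not the paper's method.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_composition(seq):
    """Fraction of each standard amino acid in the sequence (assumed descriptor)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[a] / total for a in AMINO_ACIDS]


def sep_features(seq, substrate_bits, product_bits):
    """Concatenate sequence and chemical-structure descriptors into one vector.

    substrate_bits / product_bits stand in for any binary structure
    fingerprint; how they are computed is outside this sketch.
    """
    return aa_composition(seq) + list(substrate_bits) + list(product_bits)


def binary_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    if not pos or not neg:
        raise ValueError("need both positive and negative examples")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_auc(true_classes, class_scores):
    """Unweighted mean of one-vs-rest AUCs.

    class_scores maps each class (e.g. an EC first digit) to the
    per-sample scores a model assigned to that class.
    """
    aucs = [
        binary_auc([t == c for t in true_classes], scores)
        for c, scores in class_scores.items()
    ]
    return sum(aucs) / len(aucs)
```

A feature vector built this way could be passed to any classifier, such as a Random Forests implementation, and `average_auc` applied to its per-class scores on held-out data.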


Similar articles

Exploration and Evaluation of Machine Learning-Based Models for Predicting Enzymatic Reactions.

J Chem Inf Model. 2020 Mar 23;60(3):1833-1843. doi: 10.1021/acs.jcim.9b00877. Epub 2020 Feb 27.

Comprehensive Machine Learning Prediction of Extensive Enzymatic Reactions.

J Phys Chem B. 2022 Sep 15;126(36):6762-6770. doi: 10.1021/acs.jpcb.2c03287. Epub 2022 Sep 2.

Deep learning enables high-quality and high-throughput prediction of enzyme commission numbers.

Proc Natl Acad Sci U S A. 2019 Jul 9;116(28):13996-14001. doi: 10.1073/pnas.1821905116. Epub 2019 Jun 20.

Accurately predicting enzyme functions through geometric graph learning on ESMFold-predicted structures.

Nat Commun. 2024 Sep 18;15(1):8180. doi: 10.1038/s41467-024-52533-w.

A Survey for Predicting Enzyme Family Classes Using Machine Learning Methods.

Curr Drug Targets. 2019;20(5):540-550. doi: 10.2174/1389450119666181002143355.

Predicting metabolic pathways of plant enzymes without using sequence similarity: Models from machine learning.

Plant Genome. 2020 Nov;13(3):e20043. doi: 10.1002/tpg2.20043. Epub 2020 Aug 28.

ECOH: an enzyme commission number predictor using mutual information and a support vector machine.

Bioinformatics. 2013 Feb 1;29(3):365-72. doi: 10.1093/bioinformatics/bts700. Epub 2012 Dec 5.

Reaction graph kernels predict EC numbers of unknown enzymatic reactions in plant secondary metabolism.

BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S31. doi: 10.1186/1471-2105-11-S1-S31.

ProtEC: A Transformer Based Deep Learning System for Accurate Annotation of Enzyme Commission Numbers.

IEEE/ACM Trans Comput Biol Bioinform. 2023 Nov-Dec;20(6):3691-3702. doi: 10.1109/TCBB.2023.3311427. Epub 2023 Dec 25.

A general model for predicting enzyme functions based on enzymatic reactions.

J Cheminform. 2024 Mar 31;16(1):38. doi: 10.1186/s13321-024-00827-y.

Cited by

Functional annotation of enzyme-encoding genes using deep learning with transformer layers.

Nat Commun. 2023 Nov 14;14(1):7370. doi: 10.1038/s41467-023-43216-z.

Design and Construction of Artificial Biological Systems for One-Carbon Utilization.

Biodes Res. 2023 Oct 31;5:0021. doi: 10.34133/bdr.0021. eCollection 2023.

Prediction of Enzyme Catalysis by Computing Reaction Energy Barriers via Steered QM/MM Molecular Dynamics Simulations and Machine Learning.

J Chem Inf Model. 2023 Aug 14;63(15):4623-4632. doi: 10.1021/acs.jcim.3c00772. Epub 2023 Jul 21.

Predicting Genetic Disorder and Types of Disorder Using Chain Classifier Approach.

Genes (Basel). 2022 Dec 26;14(1):71. doi: 10.3390/genes14010071.

Machine learning discovery of missing links that mediate alternative branches to plant alkaloids.

Nat Commun. 2022 Mar 16;13(1):1405. doi: 10.1038/s41467-022-28883-8.

EHreact: Extended Hasse Diagrams for the Extraction and Scoring of Enzymatic Reaction Templates.

J Chem Inf Model. 2021 Oct 25;61(10):4949-4961. doi: 10.1021/acs.jcim.1c00921. Epub 2021 Sep 29.

New Trends in Bioremediation Technologies Toward Environment-Friendly Society: A Mini-Review.

Front Bioeng Biotechnol. 2021 Aug 2;9:666858. doi: 10.3389/fbioe.2021.666858. eCollection 2021.

Machine Learning for Electronically Excited States of Molecules.

Chem Rev. 2021 Aug 25;121(16):9873-9926. doi: 10.1021/acs.chemrev.0c00749. Epub 2020 Nov 19.

EnzyMine: a comprehensive database for enzyme function annotation with enzymatic reaction chemical feature.

Database (Oxford). 2020 Oct 1;2023. doi: 10.1093/database/baaa065.
