

Accurately predicting enzyme functions through geometric graph learning on ESMFold-predicted structures.

Affiliations

School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, Guangdong, China.

High Performance Computing Department, National Supercomputing Center in Shenzhen, Shenzhen, Guangdong, China.

Publication information

Nat Commun. 2024 Sep 18;15(1):8180. doi: 10.1038/s41467-024-52533-w.

Abstract

Enzymes are crucial in numerous biological processes, with the Enzyme Commission (EC) number being a commonly used method for defining enzyme function. However, current EC number prediction technologies have not fully recognized the importance of enzyme active sites and structural characteristics. Here, we propose GraphEC, a geometric graph learning-based EC number predictor using the ESMFold-predicted structures and a pre-trained protein language model. Specifically, we first construct a model to predict the enzyme active sites, which is utilized to predict the EC number. The prediction is further improved through a label diffusion algorithm by incorporating homology information. In parallel, the optimum pH of enzymes is predicted to reflect the enzyme-catalyzed reactions. Experiments demonstrate the superior performance of our model in predicting active sites, EC numbers, and optimum pH compared to other state-of-the-art methods. Additional analysis reveals that GraphEC is capable of extracting functional information from protein structures, emphasizing the effectiveness of geometric graph learning. This technology can be used to identify unannotated enzyme functions, as well as to predict their active sites and optimum pH, with the potential to advance research in synthetic biology, genomics, and other fields.
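The abstract mentions refining predictions with a label diffusion algorithm that incorporates homology information, but gives no details. As a rough illustration of the general idea only, the sketch below runs a textbook label diffusion over a similarity graph: each protein's label scores are repeatedly blended with those of its neighbours, weighted by similarity. The function name `diffuse_labels`, the parameters `alpha` and `n_iter`, and the toy similarity values are all assumptions made for illustration; this is not GraphEC's actual implementation.

```python
def diffuse_labels(similarity, initial_scores, alpha=0.5, n_iter=50):
    """Generic label diffusion (hypothetical helper, not GraphEC's code).

    Iterates F <- alpha * S_norm * F + (1 - alpha) * Y, where S_norm is the
    row-normalised similarity matrix and Y holds the initial label scores.
    """
    n = len(similarity)
    n_labels = len(initial_scores[0])

    # Row-normalise similarities so each row sums to 1 (all-zero rows stay zero).
    norm = []
    for row in similarity:
        total = sum(row)
        norm.append([v / total if total > 0 else 0.0 for v in row])

    scores = [list(row) for row in initial_scores]
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            new_row = []
            for k in range(n_labels):
                # Similarity-weighted average of the neighbours' current scores.
                neighbour = sum(norm[i][j] * scores[j][k] for j in range(n))
                # Blend neighbour evidence with the protein's own initial score.
                new_row.append(alpha * neighbour + (1 - alpha) * initial_scores[i][k])
            updated.append(new_row)
        scores = updated
    return scores


# Toy example (made-up values): proteins 0 and 1 carry known labels A and B;
# protein 2 is a query with an uninformative initial prediction, and it is
# far more similar to protein 0 than to protein 1.
similarity = [
    [0.0, 0.0, 0.9],
    [0.0, 0.0, 0.1],
    [0.9, 0.1, 0.0],
]
initial_scores = [
    [1.0, 0.0],  # protein 0: label A
    [0.0, 1.0],  # protein 1: label B
    [0.5, 0.5],  # protein 2: undecided query
]
scores = diffuse_labels(similarity, initial_scores)
query_prefers_a = scores[2][0] > scores[2][1]
```

In this toy setting the query's score shifts toward the label of its closest homolog. The `alpha` parameter controls how much weight the neighbours receive relative to the initial prediction.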


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f311/11411130/d36dba07e7da/41467_2024_52533_Fig1_HTML.jpg
