CPI-Pred: A deep learning framework for predicting functional parameters of compound-protein interactions.

Authors

Xu Zhiqing, Barghout Rana Ahmed, Wu Jinghao, Garg Dhruv, Song Yun S, Mahadevan Radhakrishnan

Affiliations

Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, Canada.

Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, India.

Publication information

bioRxiv. 2025 Jan 21:2025.01.16.633372. doi: 10.1101/2025.01.16.633372.

Abstract

Recent advancements in deep learning have enabled functional annotation of genome sequences, facilitating the discovery of new enzymes and metabolites. However, accurately predicting compound-protein interactions (CPI) from sequences remains challenging due to the complexity of these interactions and the sparsity and heterogeneity of available data, which constrain the generalization of patterns across their solution space. In this work, we introduce CPI-Pred, a versatile deep learning model designed to predict compound-protein interaction function. CPI-Pred integrates compound representations derived from a novel message-passing neural network and enzyme representations generated by state-of-the-art protein language models, leveraging innovative sequence pooling and cross-attention mechanisms. To train and evaluate CPI-Pred, we compiled the largest dataset of enzyme kinetic parameters to date, encompassing four key metrics: the Michaelis-Menten constant (K_m), enzyme turnover number (k_cat), catalytic efficiency (k_cat/K_m), and inhibition constant (K_i). These kinetic parameters are critical for elucidating enzyme function in metabolic contexts and understanding their regulation by compounds within biological networks. We demonstrate that CPI-Pred can predict diverse types of CPI using only the amino acid sequence of enzymes and structural representations of compounds, outperforming state-of-the-art models on unseen compounds and structurally dissimilar enzymes. Our workflow provides a valuable tool for tackling a range of metabolic engineering challenges, including the design of novel enzyme sequences and compounds, such as enzyme inhibitors. Additionally, the datasets curated in this study offer a valuable resource for the scientific community, serving as a benchmark for machine learning models focused on enzyme activity and promiscuity prediction.
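
For readers less familiar with the four kinetic parameters named in the abstract, the following minimal sketch shows how they enter the textbook Michaelis-Menten rate law. It is not part of CPI-Pred and does not reproduce the authors' code. The function names and all numeric values are hypothetical, and the inhibition term assumes purely competitive inhibition, which the abstract does not specify.

```python
def michaelis_menten_rate(substrate, enzyme_total, k_cat, k_m,
                          inhibitor=0.0, k_i=None):
    """Initial reaction rate v = k_cat * [E] * [S] / (K_m_app + [S]).

    Assumption: if k_i is given, the inhibitor is treated as purely
    competitive, so the apparent K_m becomes K_m * (1 + [I] / K_i).
    """
    k_m_apparent = k_m
    if k_i is not None:
        k_m_apparent = k_m * (1.0 + inhibitor / k_i)
    return k_cat * enzyme_total * substrate / (k_m_apparent + substrate)


def catalytic_efficiency(k_cat, k_m):
    """Catalytic efficiency k_cat / K_m."""
    return k_cat / k_m


# Hypothetical values, chosen only for illustration.
K_CAT = 100.0      # turnover number, 1/s
K_M = 0.5          # Michaelis-Menten constant, mM
ENZYME = 0.001     # total enzyme concentration, mM

# At [S] = K_m the rate is half of V_max = k_cat * [E].
rate_free = michaelis_menten_rate(0.5, ENZYME, K_CAT, K_M)

# Same conditions with a competitive inhibitor at [I] = 2 mM, K_i = 1 mM.
rate_inhibited = michaelis_menten_rate(0.5, ENZYME, K_CAT, K_M,
                                       inhibitor=2.0, k_i=1.0)

efficiency = catalytic_efficiency(K_CAT, K_M)  # units: 1/(mM*s)
```

In this sketch a smaller K_i means a stronger inhibitor, because the apparent K_m rises faster with inhibitor concentration. A model such as the one described above would predict K_m, k_cat, k_cat/K_m, or K_i from an enzyme sequence and a compound structure; those predicted values could then be plugged into a rate law like this one.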

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8175/11785036/0f922f61ef55/nihpp-2025.01.16.633372v1-f0001.jpg
