State Key Laboratory of Microbial Metabolism, School of Life Science and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Minhang District, Shanghai 200240, China.
College of Science, Chongqing University of Technology, 69 Hongguang Avenue, Banan District, Chongqing 400054, China.
Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae409.
The turnover number (kcat), a measure of an enzyme's catalytic efficiency, has a wide range of applications in fields such as protein engineering and synthetic biology. Measuring kcat experimentally is time-consuming. Recently, deep learning models that predict kcat have mitigated this problem. However, the accuracy and robustness of kcat prediction still need to improve significantly, particularly for enzymes with low sequence similarity to those in the training dataset. Herein, we present DeepEnzyme, a deep learning model that combines a Transformer and a Graph Convolutional Network (GCN) to capture information from both the sequence and the 3D structure of a protein. To improve prediction accuracy, DeepEnzyme was trained on integrated features from both sequences and 3D structures. Consequently, by utilizing additional features from high-quality protein 3D structures, DeepEnzyme remains robust when processing enzymes with low sequence similarity to those in the training dataset. DeepEnzyme also makes it possible to evaluate how point mutations affect an enzyme's catalytic activity, which helps identify residue sites that are crucial for catalytic function. In summary, DeepEnzyme predicts enzymes' kcat values with improved accuracy and robustness compared with previous algorithms. This advance will contribute to our understanding of enzyme function and its evolutionary patterns across species.
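To make the structural half of this idea concrete, the following is a minimal sketch of one graph-convolution step over a residue contact graph. It is not DeepEnzyme's implementation: the function names, the 8 Å C-alpha cutoff, the toy coordinates, and the two-dimensional per-residue features (stand-ins for sequence-derived embeddings) are all assumptions chosen for illustration.

```python
import math

def contact_map(coords, cutoff=8.0):
    """Binary residue contact map from C-alpha coordinates.
    The 8 A cutoff is an illustrative assumption. Each residue is in
    contact with itself (distance 0), which provides the self-loops."""
    n = len(coords)
    return [[1.0 if math.dist(coords[i], coords[j]) <= cutoff else 0.0
             for j in range(n)] for i in range(n)]

def normalize_adjacency(adj):
    """Symmetric normalization D^(-1/2) A D^(-1/2), as in a standard GCN."""
    deg = [sum(row) for row in adj]  # >= 1 because of the self-loops
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def gcn_layer(a_hat, feats, weight):
    """One graph convolution: ReLU(A_hat @ X @ W), written with plain lists."""
    n, d_in, d_out = len(feats), len(weight), len(weight[0])
    # Aggregate each residue's features over its structural neighbours.
    agg = [[sum(a_hat[i][k] * feats[k][c] for k in range(n))
            for c in range(d_in)] for i in range(n)]
    # Apply the weight matrix and the ReLU non-linearity.
    return [[max(0.0, sum(agg[i][c] * weight[c][o] for c in range(d_in)))
             for o in range(d_out)] for i in range(n)]

# Toy example: four residues on a line; the last one is far from the others.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]  # hypothetical embeddings
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity, so the output shows pure aggregation

adj = contact_map(coords)
a_hat = normalize_adjacency(adj)
out = gcn_layer(a_hat, feats, weight)
```

With the identity weight matrix, the three residues in mutual contact end up with the average of their features, while the isolated fourth residue keeps its own. This is the sense in which a GCN lets each residue's representation depend on its 3D neighbours rather than only on its sequence neighbours.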