Department of Computer Science, Tufts University, Medford, MA 02155, USA.
Department of Chemical and Biological Engineering, Tufts University, Medford, MA 02155, USA.
Bioinformatics. 2023 Mar 1;39(3). doi: 10.1093/bioinformatics/btad089.
Site-of-metabolism (SOM) prediction has traditionally been used to identify site-specific metabolic activity within a compound so that its interaction with a metabolizing enzyme can be altered, but it is also essential for analyzing the promiscuity of enzymes on substrates. The successful prediction of SOMs and the relevant promiscuous products has a wide range of applications, including the creation of extended metabolic models (EMMs) that account for enzyme promiscuity and the construction of novel heterologous synthesis pathways. There is therefore a need to develop generalized methods that can predict molecular SOMs for a wide range of metabolizing enzymes.
This article develops a Graph Neural Network (GNN) model that classifies whether an atom (or a bond) is an SOM. Our model, GNN-SOM, is trained on enzymatic interactions, available in the KEGG database, that span all enzyme commission numbers. We demonstrate that GNN-SOM consistently outperforms baseline machine learning models whether it is trained on all enzymes, on Cytochrome P450 (CYP) enzymes, or on non-CYP enzymes. We showcase the utility of GNN-SOM in prioritizing the enzymatic products predicted to arise from enzyme promiscuity for two biological applications: the construction of EMMs and the construction of synthesis pathways.
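To make the idea of per-atom SOM classification concrete, the following is a minimal sketch of node-level scoring on a molecular graph. It is not the GNN-SOM architecture, which this abstract does not specify. The GCN-style layer, the one-hot atom features, the toy ethanol graph, and the function names are all assumptions chosen for illustration. The weights are random and untrained, so the scores carry no chemical meaning, and the enzyme information that the model is trained with is omitted.

```python
import numpy as np

def normalized_adjacency(adj):
    """Add self-loops and apply symmetric normalization: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, h, w):
    """One graph-convolution step: mix each atom with its neighbors, then ReLU."""
    return np.maximum(a_norm @ h @ w, 0.0)

def predict_som_scores(adj, features, w1, w2):
    """Return one score in (0, 1) per atom (hypothetical two-layer model)."""
    a_norm = normalized_adjacency(adj)
    hidden = gcn_layer(a_norm, features, w1)
    logits = a_norm @ hidden @ w2            # shape: (num_atoms, 1)
    return 1.0 / (1.0 + np.exp(-logits[:, 0]))

# Toy molecule (assumption): heavy atoms of ethanol, C0 - C1 - O2.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
# Illustrative atom features: one-hot element type [C, O].
features = np.array([[1, 0],
                     [1, 0],
                     [0, 1]], dtype=float)

# Random, untrained weights; a real model would learn these from labeled SOMs.
rng = np.random.default_rng(0)
w1 = rng.normal(size=(2, 8))
w2 = rng.normal(size=(8, 1))

scores = predict_som_scores(adj, features, w1, w2)
ranking = np.argsort(-scores)  # atom indices, highest score first
```

In a trained model, `ranking` would order atoms by predicted likelihood of being an SOM, which is the kind of ordering that could be used to prioritize candidate promiscuous products.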
A Python implementation of the trained SOM predictor model can be found at https://github.com/HassounLab/GNN-SOM.
Supplementary data are available at Bioinformatics online.