D-CyPre: a machine learning-based tool for accurate prediction of human CYP450 enzyme metabolic sites.
Author information
Yang Haolan, Liu Jie, Chen Kui, Cong Shiyu, Cai Shengnan, Li Yueting, Jia Zhixin, Wu Hao, Lou Tianyu, Wei Zuying, Yang Xiaoqin, Xiao Hongbin
Affiliations
School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China.
Beijing University of Chinese Medicine, Research Center of Chinese Medicine Analysis and Transformation, Beijing, China.
Publication information
PeerJ Comput Sci. 2024 May 7;10:e2040. doi: 10.7717/peerj-cs.2040. eCollection 2024.
Advances in graph neural networks (GNNs) have made it possible to predict metabolic sites accurately. Although combining GNNs with XGBoost has shown impressive performance, this combination has not yet been applied to metabolic site prediction. Previous metabolic site prediction tools focused on bonds and atoms and disregarded the overall molecular skeleton. This study introduces D-CyPre, a tool that combines atom, bond, and molecular skeleton information via two directed message-passing neural networks (D-MPNNs) and uses XGBoost to predict the metabolic sites of nine cytochrome P450 enzymes. In D-CyPre Precision Mode, the model produces fewer but more accurate results (Jaccard score: 0.497, F1: 0.660, and precision: 0.737 on the test set). In D-CyPre Recall Mode, the model produces less accurate but more comprehensive results (Jaccard score: 0.506, F1: 0.669, and recall: 0.720 on the test set). On the test set of 68 reactants, D-CyPre outperformed BioTransformer on all isoenzymes and CyProduct on most isoenzymes (5/9). For the subtypes where D-CyPre outperformed CyProduct, the Jaccard and F1 scores increased by 24% and 16%, respectively, in Precision Mode (4/9) and by 19% and 12%, respectively, in Recall Mode (5/9), relative to CyProduct, the second-best tool. Overall, D-CyPre provides more accurate predictions of human CYP450 enzyme metabolic sites.
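The abstract reports Jaccard score, F1, precision, and recall. As a rough illustration of how these four metrics relate for a single molecule, the sketch below computes them from two sets of site labels. It is not the authors' evaluation code: the function name `site_metrics`, the use of integer atom indices as site labels, and the example sets are all assumptions made for illustration, and the paper may aggregate scores across molecules and isoenzymes differently.

```python
def site_metrics(predicted, actual):
    """Compare predicted and annotated metabolic sites for one molecule.

    Both arguments are iterables of hashable site labels (hypothetical
    atom indices here). Returns precision, recall, F1, and Jaccard score.
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    union = len(predicted | actual)

    # Guard every ratio against an empty denominator.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    jaccard = true_positives / union if union else 0.0

    return {"precision": precision, "recall": recall,
            "f1": f1, "jaccard": jaccard}


# Hypothetical example: three annotated sites on one molecule.
annotated = {3, 7, 12}
narrow_prediction = {3, 7}                # fewer sites, all correct
broad_prediction = {3, 7, 12, 15, 18}     # all sites found, two extras

print(site_metrics(narrow_prediction, annotated))
print(site_metrics(broad_prediction, annotated))
```

In this toy case the narrow prediction has perfect precision but misses one site, while the broad prediction has perfect recall at the cost of two false positives, which mirrors the trade-off the abstract describes between Precision Mode and Recall Mode.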