Wang Cheng, Yuan Chuang, Wang Yahui, Chen Ranran, Shi Yuying, Patti Gary J, Hou Qingzhen
bioRxiv. 2023 Mar 12:2023.03.08.531729. doi: 10.1101/2023.03.08.531729.
Enzymatic reaction networks are crucial for exploring the mechanistic functions of metabolites and proteins in biological systems, for understanding the etiology of diseases, and for identifying potential targets for drug discovery. The growing number of known metabolic reactions enables the development of deep learning-based methods to discover new enzymatic reactions, which will expand existing enzymatic reaction networks and support the investigation of disrupted metabolism in disease.
In this study, we propose the MPI-VGAE framework to predict metabolite-protein interactions (MPI) in genome-scale heterogeneous enzymatic reaction networks across ten organisms, covering thousands of enzymatic reactions. We improved the Variational Graph Autoencoder (VGAE) model to incorporate the molecular features of metabolites and proteins together with the features of neighboring nodes, which gave the best predictive performance for MPI. The MPI-VGAE framework showed robust performance in the reconstruction of hundreds of metabolic pathways and five functional enzymatic reaction networks. It was also applied to a homogeneous metabolic reaction network, where it performed as well as other state-of-the-art methods. Furthermore, the MPI-VGAE framework was used to reconstruct disease-specific MPI networks from hundreds of disrupted metabolites and proteins in Alzheimer's disease and in colorectal cancer. A substantial number of new potential enzymatic reactions were predicted and validated by molecular docking. These results highlight the potential of the MPI-VGAE framework for discovering novel disease-related enzymatic reactions and drug targets in real-world applications.
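The abstract does not spell out the model, but the core steps of a standard variational graph autoencoder can be sketched briefly. The plain-Python sketch below assumes an encoder has already produced a mean and a log-variance vector for each node. It shows the reparameterized sampling step, the inner-product decoder that scores a candidate metabolite-protein link, and the KL regularizer. The function names and toy vectors are illustrative assumptions and are not taken from the MPI-VGAE code.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def link_probability(z_a, z_b):
    """Inner-product decoder: p(edge) = sigmoid(z_a . z_b)."""
    return sigmoid(sum(a * b for a, b in zip(z_a, z_b)))


def kl_to_standard_normal(mu, log_var):
    """KL divergence from a diagonal Gaussian N(mu, sigma^2) to N(0, I)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


# Hypothetical encoder outputs for one metabolite node and one protein node.
rng = random.Random(0)
metabolite_mu, metabolite_log_var = [0.8, -0.2, 0.5], [-2.0, -2.0, -2.0]
protein_mu, protein_log_var = [0.9, 0.1, 0.4], [-2.0, -2.0, -2.0]

z_metabolite = sample_latent(metabolite_mu, metabolite_log_var, rng)
z_protein = sample_latent(protein_mu, protein_log_var, rng)

print("predicted link probability:", link_probability(z_metabolite, z_protein))
print("KL term for the metabolite node:",
      kl_to_standard_normal(metabolite_mu, metabolite_log_var))
```

In a full model the means and log-variances would come from graph convolution layers over the node features and adjacency matrix, and training would combine a reconstruction loss over known links with the KL term.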
The MPI-VGAE framework and datasets are publicly available on GitHub at https://github.com/mmetalab/mpi-vgae.
[Author 1] received his Ph.D. in Chemistry from The Ohio State University, USA. He is currently an Assistant Professor in the School of Public Health at Shandong University, China. His research interests include bioinformatics and machine learning-based approaches with applications to biomedical networks.

[Author 2] is a research assistant at Shandong University. He obtained his M.S. degree in Biology from the University of Science and Technology of China. His research interests include biochemistry and molecular biology, cell biology, biomedicine, bioinformatics, and computational biology.

[Author 3] is a Ph.D. student in the Department of Chemistry at Washington University in St. Louis. Her research interests include biochemistry, mass spectrometry-based metabolomics, and cancer metabolism.

[Author 4] is a master's student in the School of Public Health at Shandong University, China.

[Author 5] is a master's student in the School of Public Health at Shandong University, China.

[Author 6] is the Michael and Tana Powell Professor at Washington University in St. Louis, where he holds appointments in the Department of Chemistry and the Department of Medicine. He is also the Senior Director of the Center for Metabolomics and Isotope Tracing at Washington University. His research interests include metabolomics, bioinformatics, high-throughput mass spectrometry, environmental health, cancer, and aging.

[Author 7] received his Ph.D. in Computer Science from Xiamen University, China. He is currently a Professor in the School of Software at Shandong University, China. His research interests include machine learning and its applications to bioinformatics.

[Author 8] received his Ph.D. from the Centre for Integrative Bioinformatics VU (IBIVU) at Vrije Universiteit Amsterdam, the Netherlands. Since 2020, he has served as head of the Bioinformatics Center at the National Institute of Health Data Science of China and as an Assistant Professor in the School of Public Health, Shandong University, China. His research areas are bioinformatics and computational biophysics.
Genome-scale heterogeneous networks of metabolite-protein interactions (MPI), based on thousands of enzymatic reactions across ten organisms, were constructed semi-automatically.
An enzymatic reaction prediction method, Metabolite-Protein Interaction Variational Graph Autoencoders (MPI-VGAE), was developed and optimized. By using the molecular features of both metabolites and proteins, it outperforms existing machine learning methods.
MPI-VGAE is broadly useful for reconstructing metabolic pathways, functional enzymatic reaction networks, and homogeneous networks (e.g., metabolic reaction networks).
By applying MPI-VGAE to Alzheimer's disease and colorectal cancer, we obtained several novel, biologically meaningful, disease-related protein-metabolite reactions. We further examined plausible binding modes for these protein-metabolite interactions using molecular docking, which provided useful information on disease mechanisms and drug design.
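To make the idea of a heterogeneous MPI network concrete, the sketch below turns a list of enzymatic reactions into a bipartite graph in which each metabolite is linked to the enzyme that catalyzes a reaction involving it. This is a minimal illustration and not the authors' pipeline. The record layout (`enzyme`, `substrates`, `products`) and the function names are assumptions, and the two glycolysis reactions serve only as toy data.

```python
from collections import defaultdict


def build_mpi_edges(reactions):
    """Return the set of (metabolite, enzyme) pairs implied by the reactions."""
    edges = set()
    for rxn in reactions:
        for metabolite in rxn["substrates"] + rxn["products"]:
            edges.add((metabolite, rxn["enzyme"]))
    return edges


def build_adjacency(edges):
    """Build an undirected adjacency map with typed nodes (metabolite or protein)."""
    adjacency = defaultdict(set)
    for metabolite, enzyme in edges:
        adjacency[("metabolite", metabolite)].add(("protein", enzyme))
        adjacency[("protein", enzyme)].add(("metabolite", metabolite))
    return adjacency


# Toy data: two well-known glycolysis reactions.
toy_reactions = [
    {
        "enzyme": "hexokinase",
        "substrates": ["glucose", "ATP"],
        "products": ["glucose 6-phosphate", "ADP"],
    },
    {
        "enzyme": "phosphofructokinase",
        "substrates": ["fructose 6-phosphate", "ATP"],
        "products": ["fructose 1,6-bisphosphate", "ADP"],
    },
]

mpi_edges = build_mpi_edges(toy_reactions)
mpi_adjacency = build_adjacency(mpi_edges)

for node in sorted(mpi_adjacency):
    print(node, "->", sorted(mpi_adjacency[node]))
```

Typing each node as a metabolite or a protein keeps the two node sets distinct, which is what makes the network heterogeneous. A link predictor would then be trained to score metabolite-protein pairs that are absent from this edge set.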