Maier Benjamin Dominik, Petursson Borgthor, Lussana Alessandro, Petsalaki Evangelia
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, United Kingdom.
Mol Cell Proteomics. 2025 May 15:100994. doi: 10.1016/j.mcpro.2025.100994.
Phosphorylation is an important part of the signalling system that cells use for decision making and for regulating processes such as cell division and differentiation. In humans, >90% of identified phosphosites have no annotated upstream kinase. At the same time, around 30% of kinases (as annotated in UniProt) have no known target. This knowledge gap underscores the need for large-scale, data-driven computational predictions. In this study, we created a machine learning-based model to derive a probabilistic kinase-substrate network from omics datasets. Our methodology shows improved performance compared with other state-of-the-art kinase-substrate prediction methods and provides predictions for more kinases. Importantly, it better captures new, experimentally identified kinase-substrate relationships. It can therefore improve the prioritisation of kinase-substrate pairs for illuminating the dark human cell signalling space. Our model is integrated into a web server, SELPHI, to allow unbiased analysis of phosphoproteomics data, facilitating the design of downstream experiments to uncover mechanisms of signal transduction across conditions and cellular contexts.