Gu Chunhui, Ghasemi Seyyed Mahmood, Cai Yining, Fahrmann Johannes F, Long James P, Katayama Hiroyuki, Wu Chong, Vykoukal Jody, Dennison Jennifer B, Hanash Samir, Do Kim-Anh, Irajizad Ehsan
Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States.
Department of Clinical Cancer Prevention, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States.
Bioinform Adv. 2025 Apr 26;5(1):vbaf095. doi: 10.1093/bioadv/vbaf095. eCollection 2025.
Protein identification via mass spectrometry (MS) is the primary method for untargeted protein detection. However, identification is challenging because the data are complex and the false discovery rate (FDR) must be controlled. To address these challenges, we developed a graph neural network (GNN)-based model, Graph Neural Network using Protein-Protein Interaction for Enhancing Protein Identification (Grape-Pi), which is applicable to all proteomics pipelines. The model leverages protein-protein interaction (PPI) data and employs two types of message-passing layers to integrate evidence from both the target protein and its interactors, thereby improving identification accuracy.
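The idea of combining a protein's own evidence with evidence from its interactors can be illustrated with a minimal sketch. This is not the Grape-Pi implementation: the abstract does not specify the two layer types, so the example below assumes a single round of mean aggregation over PPI neighbors, and the function names, the weights `w_self` and `w_nbr`, and the toy scores are all hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def message_pass(scores, edges, w_self=0.7, w_nbr=0.3):
    """One round of mean-aggregation message passing (illustrative only).

    Each protein's updated score mixes its own raw evidence with the mean
    evidence of its PPI neighbors. The weights are assumed constants here;
    in a trained GNN they would be learned parameters.
    """
    adj = build_adjacency(edges)
    updated = {}
    for protein, own in scores.items():
        nbrs = [scores[n] for n in adj[protein] if n in scores]
        if nbrs:
            updated[protein] = w_self * own + w_nbr * sum(nbrs) / len(nbrs)
        else:
            # An isolated protein keeps its own evidence unchanged.
            updated[protein] = own
    return updated


# Toy data (hypothetical): raw identification scores and PPI edges.
scores = {"A": 0.9, "B": 0.8, "C": 0.2, "D": 0.1}
edges = [("A", "B"), ("A", "C"), ("B", "C")]
updated = message_pass(scores, edges)
```

In this toy graph, protein C has weak evidence of its own but interacts with two well-supported proteins, so its score rises after aggregation, while the isolated protein D is left unchanged.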
On the test dataset, Grape-Pi significantly improved the area under the receiver operating characteristic curve (AUC) for distinguishing present from absent proteins relative to traditional methods: by 18% and 7% in two yeast samples and by 9% in gastric samples. Additionally, proteins identified via Grape-Pi in gastric samples correlated highly with mRNA data, and Grape-Pi identified gastric cancer proteins, such as MAP4K4, that conventional methods missed.
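For readers unfamiliar with the metric, AUC can be computed as the probability that a randomly chosen present protein receives a higher score than a randomly chosen absent one. The sketch below uses this pairwise definition; the labels and scores are invented toy values, not results from the study, and the function name `roc_auc` is an assumption for illustration.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    labels: 1 for a present protein, 0 for an absent protein.
    scores: identification scores, higher meaning more likely present.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one present and one absent protein")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): the same six proteins scored by two methods.
labels = [1, 1, 0, 0, 1, 0]
baseline_scores = [0.9, 0.4, 0.5, 0.2, 0.3, 0.6]
rescored = [0.9, 0.7, 0.5, 0.2, 0.55, 0.6]

auc_baseline = roc_auc(labels, baseline_scores)
auc_rescored = roc_auc(labels, rescored)
```

The pairwise form is quadratic in the number of proteins, which is fine for a small illustration; a rank-based implementation would be preferable for a full proteome.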
Grape-Pi is freely available at https://zenodo.org/records/11310518 and https://github.com/FDUguchunhui/GrapePi.