Department of Histology and Embryology, State Key Laboratory of Reproductive Medicine and Offspring Health, Nanjing Medical University, Nanjing 211166, China.
School of Medicine, Southeast University, Nanjing 210009, China.
Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbae055.
Knowing a protein's subcellular localization (PSL) is essential for understanding its function, and the movement of proteins between subcellular niches plays a fundamental role in regulating biological processes. Mass spectrometry-based spatio-temporal proteomics technologies can provide new insights into protein translocation, but identifying reliable translocation events remains challenging because of noise interference and insufficient data mining. We propose TransGCN, a semi-supervised graph convolutional network (GCN)-based framework that infers protein translocation events from spatio-temporal proteomics data. Using an expanded set of multiple distance features and joint graph representations of proteins, TransGCN applies a semi-supervised GCN to transfer knowledge from proteins with known PSLs, enabling prediction of both protein localization and translocation. Our results demonstrate that TransGCN outperforms current state-of-the-art methods in identifying protein translocations, especially in the presence of batch effects. It also shows excellent accuracy in PSL prediction. TransGCN is freely available on GitHub at https://github.com/XuejiangGuo/TransGCN.
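To make the semi-supervised idea concrete, the following is a minimal sketch of a generic two-layer GCN forward pass with a loss computed only on labeled nodes. It is not the TransGCN implementation: the toy graph, the feature matrix, the random weights, and the function names (`normalize_adjacency`, `gcn_forward`, `masked_cross_entropy`) are all assumptions made for illustration, and no training loop is included.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_forward(a_norm, x, w1, w2):
    """Two graph-convolution layers followed by a row-wise softmax."""
    hidden = np.maximum(a_norm @ x @ w1, 0.0)  # ReLU
    logits = a_norm @ hidden @ w2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def masked_cross_entropy(probs, labels, labeled_mask):
    """Cross-entropy averaged over labeled nodes only (the semi-supervised part)."""
    idx = np.flatnonzero(labeled_mask)
    picked = probs[idx, labels[idx]]
    return float(-np.log(picked + 1e-12).mean())


# Hypothetical toy data: 4 proteins, 3 features each, 2 compartments.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = rng.normal(size=(4, 3))
w1 = rng.normal(size=(3, 5))   # untrained, random weights
w2 = rng.normal(size=(5, 2))
labels = np.array([0, 0, 1, 1])
labeled_mask = np.array([True, False, False, True])  # only 2 proteins have known PSLs

a_norm = normalize_adjacency(adj)
probs = gcn_forward(a_norm, x, w1, w2)
loss = masked_cross_entropy(probs, labels, labeled_mask)
```

Because the normalized adjacency mixes each protein's features with those of its graph neighbors, the unlabeled proteins still receive class probabilities even though the loss is driven only by the labeled ones.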