Liu Qi, Zhu He, Fang Zheng, Dong Mingming, Qin Hongqiang, Ye Mingliang
State Key Laboratory of Medical Proteomics, National Chromatographic R. & A. Center, CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, China.
University of Chinese Academy of Sciences, Beijing, 100049, China.
Anal Bioanal Chem. 2025 Feb;417(5):989-999. doi: 10.1007/s00216-024-05499-z. Epub 2024 Aug 29.
Protein glycosylation is a highly heterogeneous post-translational modification that has been shown to vary significantly in many diseases. Because glycosylation patterns differ between diseased and healthy populations, glycosylated proteins hold promise as early indicators of multiple diseases. With the continuous development of liquid chromatography-mass spectrometry (LC-MS) technology and spectrum analysis software, the sensitivity of interpreting tandem mass spectra of intact glycopeptides, i.e., glycopeptides that are enzymatically released from glycoproteins and still carry their intact glycans, has improved significantly. From quantified intact glycopeptides, differences in protein glycosylation between samples can be obtained at multiple levels, e.g., glycoprotein, glycan, glycosite, and site-specific glycan. However, manual analysis of intact glycopeptide quantitative data at multiple levels is tedious and time-consuming. In this study, we developed a software tool named "GP-Marker" to facilitate large-scale, multi-level data mining of intact N-glycopeptide spectral datasets. The software provides a user-friendly, interactive interface that makes machine learning tools accessible to researchers without programming backgrounds. It includes a range of visualization plots for displaying differential glycosylation and supports multi-level analysis of intact glycopeptide data quantified by Glyco-Decipher.
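The multi-level view described in the abstract can be illustrated with a minimal sketch. This is not GP-Marker's implementation and does not reflect the Glyco-Decipher output format: the record fields (`protein`, `site`, `glycan`, `intensity`), the example values, and the choice to aggregate by summing intensities are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical quantified intact glycopeptides for one sample.
# Field names and values are illustrative assumptions, not a real file format.
records = [
    {"protein": "P1", "site": 52, "glycan": "HexNAc(2)Hex(5)", "intensity": 100.0},
    {"protein": "P1", "site": 52, "glycan": "HexNAc(2)Hex(6)", "intensity": 50.0},
    {"protein": "P1", "site": 110, "glycan": "HexNAc(2)Hex(5)", "intensity": 25.0},
    {"protein": "P2", "site": 7, "glycan": "HexNAc(2)Hex(5)", "intensity": 10.0},
]

# Each analysis level is defined by the fields that identify one entity at that level.
LEVELS = {
    "glycoprotein": ("protein",),
    "glycan": ("glycan",),
    "glycosite": ("protein", "site"),
    "site_specific_glycan": ("protein", "site", "glycan"),
}


def roll_up(rows, key_fields):
    """Sum intensities over all rows that share the same key fields."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        totals[key] += row["intensity"]
    return dict(totals)


def multi_level(rows):
    """Return one aggregated table per level (assumed aggregation: sum)."""
    return {level: roll_up(rows, fields) for level, fields in LEVELS.items()}


if __name__ == "__main__":
    for level, table in multi_level(records).items():
        print(level, table)
```

Running the same roll-up on each sample and comparing the resulting tables between groups would give the per-level differences the abstract refers to. Other aggregation rules (for example, a median or a normalized ratio) could be substituted for the sum.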