Jiang Haoqiang, Shang Shipeng, Sha Yutong, Zhang Lin, He Ningning, Li Lei
College of Basic Medicine, Qingdao University, Qingdao, China.
Sino Genomics Technology Co., Ltd., Qingdao, China.
Front Cell Dev Biol. 2023 Apr 28;11:1149535. doi: 10.3389/fcell.2023.1149535. eCollection 2023.
Post-translational modification (PTM) crosstalk refers to the interactions between different types of PTMs that occur on the same residue of a protein. Crosstalk sites generally have different characteristics from sites that carry a single PTM type. The features of single-PTM sites have been widely studied, whereas those of crosstalk sites have rarely been examined. For example, the characteristics of serine phosphorylation (pS) and serine ADP-ribosylation (SADPr) have been investigated, whereas those of their crosstalk (pSADPr) are unknown. In this study, we collected 3,250 human pSADPr, 7,520 SADPr, 151,227 pS and 80,096 unmodified serine sites and explored the features of the pSADPr sites. We found that the characteristics of pSADPr sites are more similar to those of SADPr sites than to those of pS or unmodified serine sites. Moreover, the crosstalk sites are more likely to be phosphorylated by some kinase families (e.g., AGC, CAMK, STE and TKL) than by others (e.g., CK1 and CMGC). Additionally, we constructed three classifiers to predict pSADPr sites from the pS dataset, the SADPr dataset and protein sequences, respectively. We built five deep-learning classifiers and evaluated them using ten-fold cross-validation and independent test datasets. We also used these classifiers as base classifiers to develop several stacking-based ensemble classifiers to improve performance. The best classifiers had AUC values of 0.700, 0.914 and 0.954 for recognizing pSADPr sites from the SADPr, pS and unmodified serine sites, respectively. The lowest prediction accuracy was obtained when separating pSADPr sites from SADPr sites, which is consistent with the observation that the characteristics of pSADPr sites are more similar to those of SADPr sites than to those of the other site types. Finally, we developed an online tool, dubbed EdeepSADPr, for predicting human pSADPr sites based on the CNN classifier. It is freely available at http://edeepsadpr.bioinfogo.org/. We expect that our investigation will promote a comprehensive understanding of PTM crosstalk.
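As an aid to reading the reported AUC values (0.700, 0.914 and 0.954), the following minimal sketch computes AUC from its pairwise definition: the probability that a randomly chosen positive site receives a higher score than a randomly chosen negative site, with ties counted as one half. This is not the authors' evaluation code. The function name `roc_auc` and the toy labels and scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUC: fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy data: 1 = pSADPr site (positive), 0 = background site (negative).
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

Under this reading, an AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so 0.700 for the pSADPr versus SADPr task indicates a much weaker separation than 0.914 or 0.954 for the other two tasks. The quadratic pairwise loop is used for clarity only; a rank-based implementation would be preferable for datasets of the size described above.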