College of Computer Science & Technology, Qingdao University, Qingdao 266071, China.
College of Life Sciences, Qingdao University, Qingdao 266071, China.
Methods. 2022 Jul;203:575-583. doi: 10.1016/j.ymeth.2021.09.008. Epub 2021 Sep 21.
Protein adenosine diphosphate-ribosylation (ADPr) is the covalent attachment of one or more ADP-ribose moieties to a target protein, and it regulates the biological functions of that protein. Identifying ADPr sites across the proteome is an essential step toward fully understanding the regulatory mechanism of ADP-ribosylation. Because experimental approaches are costly and time-consuming, a computational tool for predicting ADPr sites is needed. Serine has recently been found to be the major residue type for ADP-ribosylation, but no predictor is available. In this study, we collected thousands of experimentally validated human ADPr sites on serine residues and constructed several different machine-learning classifiers. We found that the hybrid model, dubbed DeepSADPr, which integrates a one-dimensional convolutional neural network (CNN) with the One-Hot encoding approach and the word-embedding approach, compared favourably to the other models in both ten-fold cross-validation and the independent test. Its AUC value reached 0.935 for ten-fold cross-validation. Its sensitivity, accuracy and Matthews correlation coefficient reached 0.933, 0.867 and 0.740, respectively, at a fixed specificity of 0.80. Overall, DeepSADPr is the first classifier for predicting serine ADPr sites, and it is available at http://www.bioinfogo.org/DeepSADPr.
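The reported accuracy and Matthews correlation coefficient can be related to the sensitivity and the fixed specificity by a short calculation. The sketch below is only an illustration: the function name `metrics_from_rates` and the `pos_fraction` parameter are hypothetical, and the 1:1 ratio of positive to negative sites is an assumption that the abstract does not state.

```python
import math


def metrics_from_rates(sensitivity, specificity, pos_fraction=0.5):
    """Derive accuracy and MCC from sensitivity and specificity.

    pos_fraction is the share of positive samples in the evaluation set.
    The default of 0.5 (balanced classes) is an assumption, not a value
    taken from the abstract.
    """
    neg_fraction = 1.0 - pos_fraction

    # Confusion-matrix cells expressed as fractions of all samples.
    tp = sensitivity * pos_fraction
    fn = (1.0 - sensitivity) * pos_fraction
    tn = specificity * neg_fraction
    fp = (1.0 - specificity) * neg_fraction

    accuracy = tp + tn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Sensitivity and fixed specificity as quoted in the abstract.
acc, mcc = metrics_from_rates(0.933, 0.80)
print(f"accuracy={acc:.4f}, MCC={mcc:.4f}")
```

Under the balanced-class assumption, the accuracy and MCC computed this way are approximately 0.8665 and 0.7396, which agrees with the reported 0.867 and 0.740 after rounding. A different class ratio would give different values.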