Department of Technical Support, Gansu Computing Centre, Lanzhou, 730000, China.
Department of Pharmacy, First Hospital of Lanzhou University, Lanzhou, 730000, China.
J Chem Inf Model. 2015 Sep 28;55(9):2015-25. doi: 10.1021/acs.jcim.5b00276. Epub 2015 Aug 26.
S-Palmitoylation is a key regulatory mechanism controlling protein targeting, localization, stability, and activity. Because increasing evidence implicates its disruption in many human diseases, the identification of palmitoylation sites is attracting growing attention. However, the computational methods published so far for this purpose suffer from a poor balance between sensitivity and specificity; as a result, they generalize poorly to external validation sets, which hinders further analysis of the associations between disrupted palmitoylation and human inherited diseases. In this work, we present a reliable method for identifying protein S-palmitoylation sites, called SeqPalm, based on a series of newly composed features derived from protein sequences and on the synthetic minority oversampling technique (SMOTE). With only 16 extracted key features, this approach achieves the most favorable prediction performance reported to date, with sensitivity, specificity, and Matthews correlation coefficient values of 95.4%, 96.3%, and 0.917, respectively. All known disease-associated variations are then analyzed with SeqPalm. We find that 243 potential losses or gains of palmitoylation sites are highly associated with human inherited diseases. The analysis suggests several potential therapeutic targets for inherited diseases associated with loss or gain of palmitoylation function, and some of the predictions are consistent with existing biological evidence. This work therefore presents a novel approach to uncovering the molecular basis of pathogenesis associated with abnormal palmitoylation. SeqPalm is available online at http://lishuyan.lzu.edu.cn/seqpalm; it can both annotate the palmitoylation sites of proteins and identify losses or gains of palmitoylation sites caused by protein variations.
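The three performance measures quoted above can be computed from a binary confusion matrix. The following is a minimal sketch of the standard definitions, not the SeqPalm implementation; the function names and the toy labels are assumptions made purely for illustration and do not come from the paper's data.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels, where 1 = palmitoylated site."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def sensitivity(tp, fn):
    """Fraction of true sites that are predicted as sites."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def specificity(tn, fp):
    """Fraction of non-sites that are predicted as non-sites."""
    return tn / (tn + fp) if (tn + fp) else 0.0

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy labels (hypothetical, for illustration only).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(toy_true, toy_pred)
print(f"sensitivity={sensitivity(tp, fn):.3f}")
print(f"specificity={specificity(tn, fp):.3f}")
print(f"MCC={mcc(tp, tn, fp, fn):.3f}")
```

Because the Matthews correlation coefficient uses all four confusion-matrix cells, it is only high when sensitivity and specificity are both high, which is why it is a useful summary on class-imbalanced data such as modification-site prediction.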