State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing, China.
PLoS One. 2011;6(7):e22930. doi: 10.1371/journal.pone.0022930. Epub 2011 Jul 29.
As one of the most important reversible protein post-translational modifications, ubiquitination is involved in many biological processes and is closely implicated in various diseases. To fully decipher the molecular mechanisms of ubiquitination-related biological processes, an initial but crucial step is to identify ubiquitylated substrates and their ubiquitination sites. Here, a new bioinformatics tool named CKSAAP_UbSite was developed to predict ubiquitination sites from protein sequences. CKSAAP_UbSite is built on a Support Vector Machine (SVM), and its distinguishing feature is that it uses the composition of k-spaced amino acid pairs surrounding a query site (i.e. any lysine in a query sequence) as input. When trained and tested on the dataset of yeast ubiquitination sites (Radivojac et al., Proteins, 2010, 78: 365-380), a 100-fold cross-validation on a 1:1 ratio of positive and negative samples showed that the accuracy and Matthews correlation coefficient (MCC) of CKSAAP_UbSite reached 73.40% and 0.4694, respectively. CKSAAP_UbSite was also extensively benchmarked and performed better than some existing predictors, suggesting that it can serve as a useful tool for the community. Currently, CKSAAP_UbSite is freely accessible at http://protein.cau.edu.cn/cksaap_ubsite/. Moreover, we found that the sequence patterns around ubiquitination sites are not conserved across different species. To ensure reasonable prediction performance, the application of the current CKSAAP_UbSite should be limited to the yeast proteome.
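To make the encoding idea concrete, the following is a minimal sketch of how the composition of k-spaced amino acid pairs around a lysine might be computed. It is an illustration, not the authors' implementation: the abstract does not state the window size, the range of k, or how sequence ends are handled, so the `flank` and `k_max` parameters, the padding symbol `X`, and the function names `extract_window` and `cksaap` are all assumptions introduced here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "X"  # assumed padding symbol for positions beyond the sequence ends
ALPHABET = AMINO_ACIDS + PAD
# All ordered residue pairs over the padded alphabet: 21 * 21 = 441 pairs.
PAIRS = ["".join(p) for p in product(ALPHABET, repeat=2)]


def extract_window(sequence, site, flank):
    """Return the fragment of length 2*flank + 1 centred on a lysine.

    `site` is a 0-based index; missing positions are filled with PAD.
    """
    if sequence[site] != "K":
        raise ValueError("query site must be a lysine (K)")
    left = sequence[max(0, site - flank):site]
    right = sequence[site + 1:site + 1 + flank]
    return PAD * (flank - len(left)) + left + "K" + right + PAD * (flank - len(right))


def cksaap(window, k_max):
    """Concatenate pair frequencies for spacings k = 0..k_max.

    For each k, every pair (window[i], window[i + k + 1]) is counted and the
    counts are divided by the number of such pairs in the window.
    Pairs containing a symbol outside ALPHABET are skipped (an assumption).
    """
    if k_max >= len(window) - 1:
        raise ValueError("k_max is too large for this window")
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_pairs = len(window) - k - 1
        for i in range(n_pairs):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        features.extend(counts[p] / n_pairs for p in PAIRS)
    return features


# Toy example with assumed parameters (flank=3, k_max=2) on a made-up sequence.
window = extract_window("MKTAYIAKQR", site=1, flank=3)
features = cksaap(window, k_max=2)
```

Each lysine in a query sequence would yield one such fixed-length vector (here 3 × 441 values), which could then be passed to an SVM implementation for training or prediction. The choice of kernel and parameters is not given in the abstract and is left open here.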