Tsai Chi-Hung, Chen Bo-Juen, Chan Chen-Hsiung, Liu Hsuan-Liang, Kao Cheng-Yan
Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan 106.
Bioinformatics. 2005 Dec 15;21(24):4416-9. doi: 10.1093/bioinformatics/bti715. Epub 2005 Oct 13.
Accurate prediction of disulfide connectivity is a step towards solving the protein structure prediction problem. In this study, a descriptor derived from the sequential distance between oxidized cysteines (denoted DOC) is proposed. A support vector machine (SVM) approach based on weighted graph matching was then developed to predict the disulfide connectivity pattern in proteins. With DOC, our SVM models achieved a prediction accuracy of 63%, which is significantly higher than that obtained by previous approaches. The results show that combining the non-local descriptor DOC with local sequence profiles significantly improves prediction accuracy. These improvements demonstrate that DOC, with a proper scaling scheme, is an effective feature for predicting disulfide connectivity. The method developed in this work is available through the web server PreCys (prediction of cys-cys linkages of proteins).
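To make the two ideas in the abstract concrete, the following minimal Python sketch computes a DOC-style value as the raw sequence separation between two cysteines and then selects the connectivity pattern whose pairs have the highest total weight. Several things here are assumptions and not taken from the paper: the function names, the use of unscaled distance (the abstract mentions a scaling scheme but does not specify it), the `pair_weight` dictionary standing in for per-pair SVM scores, and the brute-force enumeration of pairings, which replaces whatever weighted graph matching algorithm the authors used and is only practical for a handful of cysteines.

```python
from typing import Dict, Iterator, List, Tuple

Pair = Tuple[int, int]


def cysteine_positions(sequence: str) -> List[int]:
    """Return 0-based indices of every cysteine ('C') in the sequence."""
    return [i for i, residue in enumerate(sequence) if residue == "C"]


def doc(pos_a: int, pos_b: int) -> int:
    """Unscaled sequence separation between two cysteines.

    Assumption: the paper applies a scaling scheme to this distance,
    which the abstract does not describe, so none is applied here.
    """
    return abs(pos_b - pos_a)


def perfect_matchings(items: List[int]) -> Iterator[List[Pair]]:
    """Yield every way to split an even-length list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in perfect_matchings(remaining):
            yield [(first, partner)] + tail


def best_pattern(positions: List[int],
                 pair_weight: Dict[Pair, float]) -> List[Pair]:
    """Pick the pairing with the largest summed weight.

    `pair_weight` is a hypothetical stand-in for per-pair scores from a
    trained model; keys are (smaller_position, larger_position).
    """
    if len(positions) % 2 != 0:
        raise ValueError("an even number of bonded cysteines is required")
    ordered = sorted(positions)
    best: List[Pair] = []
    best_score = float("-inf")
    for matching in perfect_matchings(ordered):
        score = sum(pair_weight[pair] for pair in matching)
        if score > best_score:
            best, best_score = matching, score
    return best
```

In use, one would call `cysteine_positions` on a sequence, score every cysteine pair with a model that takes `doc(a, b)` and local sequence features as input, and pass those scores to `best_pattern`. For larger numbers of cysteines, a polynomial-time maximum weight matching routine would be a more suitable choice than enumeration.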