Kandaswamy Krishna Kumar, Pugalenthi Ganesan, Suganthan P N, Gangal Rajeev
Institute for Neuro- and Bioinformatics, University of Lübeck, 23538 Lübeck, Germany.
Protein Pept Lett. 2010 Apr;17(4):423-30. doi: 10.2174/092986610790963726.
X-ray crystallography is the most widely used method for determining three-dimensional protein structures. Selecting target proteins that can yield high-quality crystals for X-ray crystallography is a challenging task. Predicting a protein's crystallization propensity from sequence information is therefore useful for choosing crystallization targets. Recently, support vector machines have been widely used to solve various biological problems. In this work, we present SVMCRYS, a method that uses a support vector machine to classify protein sequences as 'amenable to crystallization' or 'resistant to crystallization'. SVMCRYS was trained on a dataset containing 728 sequences that gave diffraction-quality crystals and 728 sequences for which work had been stopped before crystals were obtained. The performance of SVMCRYS was compared with that of other sequence-based crystallization prediction methods, such as SECRET, CRYSTALP, OB-Score, ParCrys and XtalPred, using three different datasets. SVMCRYS achieved a better prediction rate, with higher sensitivity and specificity. Our analysis suggests that SVMCRYS can be used to predict which proteins are amenable to crystallization and which are difficult to crystallize. The SVMCRYS software, dataset and feature set can be obtained from http://www3.ntu.edu.sg/home/EPNSugan/index_files/svmcrys.htm.
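The comparison above is reported in terms of sensitivity and specificity. As a minimal sketch of how these two measures are usually computed for a two-class predictor, the Python snippet below counts true and false predictions and derives both rates. The function name `evaluate`, the class `BinaryMetrics`, the label strings and the toy predictions are all hypothetical illustrations. They are not part of the SVMCRYS software and do not reproduce any result from the paper.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    sensitivity: float  # TP / (TP + FN): fraction of crystallizable proteins recovered
    specificity: float  # TN / (TN + FP): fraction of resistant proteins recovered
    accuracy: float     # (TP + TN) / total number of predictions


def evaluate(y_true, y_pred, positive="amenable"):
    """Compute sensitivity, specificity and accuracy for a two-class predictor.

    `positive` is the label treated as the positive class (hypothetical name).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    return BinaryMetrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        accuracy=(tp + tn) / len(y_true),
    )


# Toy data for illustration only: 4 amenable and 4 resistant sequences.
y_true = ["amenable"] * 4 + ["resistant"] * 4
y_pred = ["amenable", "amenable", "amenable", "resistant"] + ["resistant"] * 4
metrics = evaluate(y_true, y_pred)
```

On a balanced set such as the 728 + 728 training sequences described above, accuracy is the mean of sensitivity and specificity, so reporting both rates shows whether a method favors one class over the other.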