Zimmermann Olav, Hansmann Ulrich H E
John von Neumann Institut für Computing, Research Centre Jülich, 52425 Jülich, Germany.
J Chem Inf Model. 2008 Sep;48(9):1903-8. doi: 10.1021/ci800178a. Epub 2008 Sep 3.
Constraint generation for 3D structure prediction and structure-based database searches both benefit from fine-grained prediction of local structure. In this work, we present LOCUSTRA, a novel scheme for the multiclass prediction of local structure that uses two layers of support vector machines (SVMs). Using the 16-letter structural alphabet of de Brevern et al. (Proteins: Struct., Funct., Bioinf. 2000, 41, 271-287), we assess its prediction ability on an independent test set of 222 proteins and compare our method with three-class secondary structure prediction and with direct prediction of dihedral angles. The prediction accuracy is Q16 = 61.0% for the 16 classes of the structural alphabet and Q3 = 79.2% for a simple mapping to the three secondary structure classes helix, sheet, and coil. We achieve a mean phi (psi) error of 24.74 degrees (38.35 degrees) and a median RMSDA (root-mean-square deviation of the dihedral angles) per protein chain of 52.1 degrees. These results compare favorably with related approaches. The LOCUSTRA web server is freely available to researchers at http://www.fz-juelich.de/nic/cbb/service/service.php.
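The evaluation measures quoted above (per-class accuracy, mean phi/psi error, RMSDA) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The function names are hypothetical, and the definitions are assumptions: angle differences are wrapped to the interval [-180, 180) degrees, and RMSDA is taken over the phi and psi deviations of one chain pooled together. The abstract does not spell out either convention.

```python
import math


def angular_diff(a, b):
    """Signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def mean_abs_error(pred, true):
    """Mean absolute angular error in degrees (assumed definition)."""
    return sum(abs(angular_diff(p, t)) for p, t in zip(pred, true)) / len(pred)


def rmsda(pred_phi, pred_psi, true_phi, true_psi):
    """Root-mean-square deviation of dihedral angles for one chain.

    Assumption: phi and psi deviations are pooled before averaging.
    """
    sq = [angular_diff(p, t) ** 2 for p, t in zip(pred_phi, true_phi)]
    sq += [angular_diff(p, t) ** 2 for p, t in zip(pred_psi, true_psi)]
    return math.sqrt(sum(sq) / len(sq))


def q_accuracy(pred, true):
    """Percentage of residues whose predicted class equals the true class."""
    return 100.0 * sum(p == t for p, t in zip(pred, true)) / len(true)


# Toy values for illustration only, not data from the paper.
print(mean_abs_error([170.0, 0.0], [-170.0, 10.0]))
print(rmsda([170.0], [0.0], [-170.0], [0.0]))
print(q_accuracy("mmdd", "mmdc"))
```

Wrapping matters because dihedral angles are periodic: a prediction of 170 degrees against a true value of -170 degrees is off by 20 degrees, not 340. The same `q_accuracy` function would cover both Q16 and Q3, with the class strings drawn from the 16-letter alphabet or from the three-state mapping respectively.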