Kim Hyunsoo, Park Haesun
Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota 55455, USA.
Proteins. 2004 Feb 15;54(3):557-62. doi: 10.1002/prot.10602.
Predicting the relative solvent accessibility of a protein provides useful information for predicting its tertiary structure. The SVMpsi method, which uses support vector machines (SVMs) together with the position-specific scoring matrix (PSSM) generated by PSI-BLAST, has been applied to improve the prediction accuracy of relative solvent accessibility. We introduce a three-dimensional local descriptor that captures information about expected remote contacts through both the long-range interaction matrix and neighboring sequences. Moreover, we apply feature weights to the SVM kernels so that the significance of each feature depends on its distance from the amino acid being predicted. With a two-state model, relative solvent accessibility is predicted at 78.7%, 80.7%, 82.4%, and 87.4% accuracy for accessibility thresholds of 25%, 16%, 5%, and 0%, respectively. Three-state prediction achieves 64.5% accuracy with thresholds of 9% and 36%. The support vector machine approach has thus been applied successfully to solvent accessibility prediction by considering long-range interactions and handling unbalanced data.
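To make the state definitions and the distance-dependent feature weighting concrete, the following is a minimal Python sketch and not the SVMpsi implementation. The abstract does not give the weighting function, the kernel, or the exact boundary conventions, so several things here are assumptions: the function names (`two_state`, `three_state`, `position_weights`, `weighted_rbf`), the linear decay of the weights, the RBF kernel form, and the choice of which side of each threshold counts as exposed.

```python
import math

def two_state(rsa_percent, threshold):
    """Two-state label: 'e' (exposed) if RSA exceeds the threshold, else 'b' (buried).
    Assumption: strict '>' so that a 0% threshold still yields two classes."""
    return "e" if rsa_percent > threshold else "b"

def three_state(rsa_percent, low=9.0, high=36.0):
    """Three-state label using the 9% and 36% thresholds from the abstract.
    Assumption: buried below `low`, exposed at or above `high`, intermediate otherwise."""
    if rsa_percent < low:
        return "b"
    if rsa_percent < high:
        return "i"
    return "e"

def position_weights(half_width, decay=0.1):
    """Hypothetical weighting: 1.0 at the central residue, falling linearly
    with sequence distance, floored at 0."""
    return [max(0.0, 1.0 - decay * abs(offset))
            for offset in range(-half_width, half_width + 1)]

def weighted_rbf(window_a, window_b, weights, gamma=0.1):
    """RBF kernel over two sequence windows with per-position weights.
    Each window is a list of per-position feature vectors (e.g. PSSM scores)."""
    dist2 = 0.0
    for w, col_a, col_b in zip(weights, window_a, window_b):
        dist2 += w * sum((a - b) ** 2 for a, b in zip(col_a, col_b))
    return math.exp(-gamma * dist2)

# Toy windows of 5 positions with 2 features each (real PSSM columns have 20).
weights = position_weights(half_width=2, decay=0.25)
base = [[0.0, 0.0] for _ in range(5)]
center_changed = [row[:] for row in base]
center_changed[2][0] = 1.0   # difference at the central residue
edge_changed = [row[:] for row in base]
edge_changed[0][0] = 1.0     # same difference at the window edge

k_center = weighted_rbf(base, center_changed, weights)
k_edge = weighted_rbf(base, edge_changed, weights)
```

Under this weighting, a difference at the central position lowers the kernel value more than the same difference at the window edge, which is the intended effect of letting significance depend on distance from the amino acid being predicted.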