Kaur Harpreet, Raghava G P S
Institute of Microbial Technology, Sector 39A, Chandigarh, India.
Protein Sci. 2003 May;12(5):923-9. doi: 10.1110/ps.0241703.
In this study, we attempted to develop a method for predicting gamma-turns in proteins. First, we implemented the statistical and machine-learning techniques commonly used in protein structure prediction and applied them to gamma-turn prediction. All methods were trained and tested on a set of 320 nonhomologous protein chains using fivefold cross-validation. All of them performed very poorly, with a Matthews correlation coefficient (MCC) ≤ 0.06. Second, the secondary structure predicted by PSIPRED was used in gamma-turn prediction. With secondary structure information, machine-learning methods outperformed statistical methods and achieved an MCC of 0.11. Performance improved further when a multiple sequence alignment, rather than a single sequence, was used as input. Based on this study, we developed GammaPred, a method for gamma-turn prediction (MCC = 0.17). GammaPred is a neural-network-based method that predicts gamma-turns in two steps. In the first step, a sequence-to-structure network predicts gamma-turns from a multiple alignment of the protein sequence. In the second step, a structure-to-structure network takes as input the gamma-turns predicted in the first step and the secondary structure predicted by PSIPRED.
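The MCC values quoted above summarize a per-residue confusion matrix (turn versus non-turn) in a single number between -1 and 1. The following is a minimal sketch of the standard MCC formula, not code from GammaPred. The function name and the convention of returning 0.0 when the denominator is zero are assumptions made for illustration.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/fn: gamma-turn residues predicted as turn / non-turn.
    tn/fp: non-turn residues predicted as non-turn / turn.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal total is zero,
    # for example when no residue is ever predicted as a turn.
    if denominator == 0:
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the call; not data from the study.
    print(matthews_corrcoef(tp=30, tn=900, fp=40, fn=30))
```

MCC is a natural choice when one class is much rarer than the other, because a predictor that labels every residue as non-turn can reach high accuracy yet has no positive predictions, which this sketch scores as 0.0.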
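The second step described in the abstract combines two per-residue signals: the gamma-turn predictions from the first network and the secondary structure predicted by PSIPRED. The sketch below shows one plausible way to assemble such inputs with a sliding window. It is an illustration only, not the authors' encoding: the function name, the window size, the zero padding at chain ends, and the three-state H/E/C one-hot encoding are all assumptions.

```python
from typing import List

SS_STATES = "HEC"  # assumed three-state alphabet: helix, strand, coil


def second_stage_inputs(
    turn_scores: List[float], ss: str, window: int = 5
) -> List[List[float]]:
    """Build one feature vector per residue for a structure-to-structure network.

    Each window position contributes 4 values: the first-stage turn score
    followed by a one-hot encoding of the predicted secondary structure.
    Positions that fall outside the chain are zero-padded (an assumption).
    """
    if len(turn_scores) != len(ss):
        raise ValueError("turn_scores and ss must have the same length")
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")

    half = window // 2
    per_position = 1 + len(SS_STATES)
    rows: List[List[float]] = []
    for center in range(len(ss)):
        row: List[float] = []
        for pos in range(center - half, center + half + 1):
            if 0 <= pos < len(ss):
                row.append(turn_scores[pos])
                row.extend(1.0 if ss[pos] == state else 0.0 for state in SS_STATES)
            else:
                row.extend([0.0] * per_position)
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Hypothetical first-stage scores and secondary structure string.
    features = second_stage_inputs([0.1, 0.9, 0.2], "CHC", window=3)
    print(len(features), len(features[0]))
```

Each row would then be passed to the second network, which outputs the final turn or non-turn decision for the central residue of the window.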