Department of Chemistry, Tongji University, Shanghai, China.
PLoS One. 2012;7(11):e48389. doi: 10.1371/journal.pone.0048389. Epub 2012 Nov 7.
Turns are a critical element of protein structure and play a crucial role in loops, folds, and interactions. Current methods are well developed for predicting individual turn types, including α-turns, β-turns, and γ-turns. However, further protein structure and function prediction requires a uniform model that can accurately predict all types of turns simultaneously.
In this study, we present a novel approach, TurnP, which investigates all the turns in a protein with a unified model. The main characteristics of TurnP are: (i) it uses newly exploited features of structural evolution information (secondary structure and shape string of the protein) based on structure homologies, (ii) it considers all types of turns in a unified model, and (iii) it can accurately predict all turns simultaneously for a query. TurnP uses predicted secondary structures and predicted shape strings, both predicted with improved accuracy by methods developed by our group. Sequence and structural evolution features, namely the sequence profile, the secondary structure profile, and the shape string profile, are then generated by sequence and structure alignment. When TurnP was validated on a non-redundant dataset (4,107 entries) by five-fold cross-validation, it achieved an accuracy of 88.8% and a sensitivity of 71.8%, exceeding the most advanced predictors of specific turn types. Newly determined sequences and the EVA and CASP9 datasets were used as independent tests; the turn prediction results on these sets were also strong and confirmed the good performance of TurnP for practical applications.
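For readers unfamiliar with the two reported figures, the following is a minimal sketch of how per-residue accuracy and sensitivity are commonly computed for a binary turn/non-turn labelling. The abstract does not state the exact definitions used for TurnP, so the standard confusion-matrix definitions are assumed here; the function name `turn_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
def turn_metrics(true_labels, predicted_labels):
    """Return (accuracy, sensitivity) for per-residue binary labels.

    Labels are assumed to be 1 for a turn residue and 0 for a non-turn
    residue. These are the standard definitions, assumed here rather
    than confirmed by the abstract:
      accuracy    = (TP + TN) / (TP + TN + FP + FN)
      sensitivity = TP / (TP + FN)
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")

    tp = tn = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity


if __name__ == "__main__":
    # Toy example: 8 residues, 3 of which are true turn residues.
    observed = [1, 1, 0, 0, 0, 1, 0, 0]
    predicted = [1, 0, 0, 0, 0, 1, 1, 0]
    acc, sens = turn_metrics(observed, predicted)
    print(f"accuracy={acc:.3f} sensitivity={sens:.3f}")
```

Because non-turn residues usually outnumber turn residues, accuracy alone can look high even when many turns are missed, which is why sensitivity is reported alongside it.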