College of Marine Life Science, Ocean University of China, Yushan Road, Qingdao 266003, PR China.
Biochimie. 2013 Sep;95(9):1741-4. doi: 10.1016/j.biochi.2013.05.017. Epub 2013 Jun 14.
In this study, a 12-dimensional feature vector is constructed to reflect the general contents and spatial arrangements of the secondary structural elements of a given protein sequence. Among the 12 features, 6 novel features are specifically designed to improve the prediction accuracy for the α/β and α + β classes, based on the distributions of α-helices and β-strands and on the characteristics of parallel and anti-parallel β-sheets. To evaluate our method, the jackknife cross-validation test is employed on two widely used datasets, 25PDB and 1189, with sequence similarity lower than 25% and 40%, respectively. Our method outperforms recently reported methods in most cases, and the 6 newly designed features have a significant positive effect on prediction accuracy, especially for the α/β and α + β classes.
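To make the two ideas in the abstract concrete, here is a minimal Python sketch of (a) deriving content and arrangement features from a secondary-structure string and (b) a jackknife (leave-one-out) test. The abstract does not list the 12 features or name the classifier, so everything here is an assumption for illustration: the feature names, the H/E/C string encoding, and the 1-nearest-neighbour classifier are hypothetical stand-ins, not the authors' method.

```python
from itertools import groupby


def segments(ss):
    """Split a secondary-structure string (assumed alphabet: H, E, C) into (state, length) runs."""
    return [(state, sum(1 for _ in run)) for state, run in groupby(ss)]


def illustrative_features(ss):
    """Hypothetical features: helix/strand content plus a crude arrangement measure.

    These are NOT the 12 features of the paper, only examples of the same kind.
    """
    n = len(ss)
    if n == 0:
        raise ValueError("secondary-structure string must be non-empty")
    runs = segments(ss)
    helix_runs = [length for state, length in runs if state == "H"]
    strand_runs = [length for state, length in runs if state == "E"]
    # Order of helix and strand segments with coil removed; frequent H<->E
    # switches suggest interleaved elements, few switches suggest segregation.
    order = [state for state, _ in runs if state in ("H", "E")]
    alternations = sum(1 for a, b in zip(order, order[1:]) if a != b)
    return {
        "helix_content": ss.count("H") / n,
        "strand_content": ss.count("E") / n,
        "max_helix_len": max(helix_runs, default=0) / n,
        "max_strand_len": max(strand_runs, default=0) / n,
        "alternations": alternations,
    }


def nearest_neighbour(train_x, train_y, x):
    """Stand-in classifier (assumption): label of the closest training vector."""
    dists = [sum((a - b) ** 2 for a, b in zip(t, x)) for t in train_x]
    return train_y[dists.index(min(dists))]


def jackknife_accuracy(vectors, labels, classify=nearest_neighbour):
    """Jackknife test: hold out each sample once, train on the rest, score the held-out one."""
    correct = 0
    for i in range(len(vectors)):
        train_x = vectors[:i] + vectors[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if classify(train_x, train_y, vectors[i]) == labels[i]:
            correct += 1
    return correct / len(vectors)
```

In practice the secondary-structure string would come from a predictor run on the sequence, and each protein's feature dictionary would be flattened into a vector before being passed to `jackknife_accuracy` together with its structural class label.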