Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai 600 036, India.
BMC Bioinformatics. 2011 Feb 15;12 Suppl 1(Suppl 1):S20. doi: 10.1186/1471-2105-12-S1-S20.
Structure conservation within the various α-helix subclasses reveals the sequence- and context-dependent factors that distort the α-helix. The sequence-structure relationship in these subclasses can be used to predict structural variations in an α-helix from its sequence alone. We train a support vector machine (SVM) with a dot-product kernel to discriminate between regular and non-regular α-helices purely from their sequences, which are represented by various overall and position-specific amino acid propensities.
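As a rough illustration of what a propensity-based representation and a dot-product kernel decision could look like, the following minimal sketch encodes a helix sequence as a short feature vector and scores it with a linear function. The propensity definition used here (residue frequency in a class divided by its frequency in a background set), the three-feature encoding, the toy sequences, and the names `propensities`, `encode` and `linear_decision` are all assumptions made for illustration; the abstract does not specify the actual features or training procedure.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def propensities(class_seqs, background_seqs):
    """Assumed definition: frequency in the class / frequency in the background."""
    cls = Counter("".join(class_seqs))
    bg = Counter("".join(background_seqs))
    n_cls, n_bg = sum(cls.values()), sum(bg.values())
    result = {}
    for aa in AMINO_ACIDS:
        f_bg = bg[aa] / n_bg
        # Residues absent from the background get a propensity of 0.0.
        result[aa] = (cls[aa] / n_cls) / f_bg if f_bg > 0 else 0.0
    return result

def encode(seq, prop):
    """Hypothetical encoding: mean propensity, plus first- and last-residue values."""
    values = [prop.get(aa, 0.0) for aa in seq]
    return [sum(values) / len(values), values[0], values[-1]]

def linear_decision(x, w, b):
    """Dot-product (linear) kernel decision value: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Toy sequences, invented for illustration only (not data from the study).
regular = ["AEELLKK", "EEALAKR"]
others = ["APSLHFW", "DPSAYHG"]

prop = propensities(regular, regular + others)
features = encode("AEELLKK", prop)

# The weights and bias would normally come from SVM training; these are placeholders.
score = linear_decision(features, w=[1.0, 0.5, 0.5], b=-1.0)
print(features, score)
```

In practice the weight vector and bias would be learned by an SVM solver, and the feature vector would contain many more overall and position-specific terms than the three used here.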
We characterize the structural distortions in five α-helix subclasses. The sequence-structure correlation in the subclasses reveals that an increased propensity for proline, histidine, serine, aspartic acid and aromatic amino acids is responsible for distortions of the regular α-helix. The N-terminus of the regular α-helix prefers neutral and acidic polar amino acids, while the C-terminus prefers basic polar amino acids. Proline is preferred in the first turn of the regular α-helix, while its presence favors the kinked and curved subclasses. The SVM discriminates between the regular α-helix and the remaining subclasses with a precision of 80.97% and a recall of 88.05%.
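For readers less familiar with the two reported metrics, the sketch below shows how precision and recall are computed from true and predicted labels. It assumes that the regular α-helix is treated as the positive class (label 1), which the abstract does not state explicitly. The labels are invented toy values, not data from the study, and `precision_recall` is a hypothetical helper name.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Precision: fraction of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of true positives that were recovered.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Toy labels: 1 = regular helix (assumed positive class), 0 = any other subclass.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

p, r = precision_recall(y_true, y_pred)
print(f"precision={p:.2f} recall={r:.2f}")
```

Under this reading, a recall higher than the precision would mean the classifier recovers most regular helices at the cost of labeling some distorted helices as regular.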
The performance of an SVM trained on sequence features demonstrates the correlation between structural variation in helices and their sequences. The results presented here are useful for the computational design of helices and for predicting structural perturbations in a helix purely from its sequence.