Fonseca Nuno A, Camacho Rui, Magalhães A L
IBMC and LIACC, R. Campo Alegre, 1021/1055, 4169-007 Porto, Portugal.
Proteins. 2008 Jan 1;70(1):188-96. doi: 10.1002/prot.21525.
A systematic survey was carried out on an unbiased sample of 815 protein chains with a maximum of 20% homology, selected from the Protein Data Bank, whose structures were solved at a resolution better than 1.6 Å and with an R-factor lower than 25%. A set of 5556 subsequences with alpha-helix or 3(10)-helix motifs was extracted from these chains. Global and local propensities were then calculated for all possible amino acid pairs of the type (i, i + 1), (i, i + 2), (i, i + 3), and (i, i + 4), starting at the relevant helical positions N1, N2, N3, C3, C2, C1, and N-int (interior positions), and also at the first nonhelical positions at both termini of the helices, namely N-cap and C-cap. Statistical analysis of the propensity values showed that pairing depends significantly on the type of the amino acids and on the position of the pair. A few sequences of three and four amino acids were selected, and their high prevalence in helices is outlined in this work. The Glu-Lys-Tyr-Pro sequence shows a peculiar distribution in proteins, which may suggest a relevant structural role in alpha-helices when Pro is located at the C-cap position. A bioinformatics tool was developed that automatically and periodically updates the results and makes them available on a website.
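The abstract does not state the formula used for the pair propensities. As a minimal sketch only, the Python below assumes a common observed-over-expected definition: the frequency of a residue pair at positions (i, i + k) divided by the product of the frequencies of each residue at its own position. The function name `pair_propensities`, its `start` parameter, and the use of a 0-based string index for a helical position are all hypothetical and are not taken from the paper or its web tool.

```python
from collections import Counter


def pair_propensities(helices, k, start=None):
    """Observed/expected ratio for residue pairs at (i, i + k).

    Assumption (not from the paper): propensity(a, b) =
        P(a at i and b at i + k) / (P(a at i) * P(b at i + k)),
    estimated only over positions that actually form a pair.

    helices: iterable of one-letter amino acid strings.
    k:       spacing between the paired residues (1 to 4 in the abstract).
    start:   None counts every i (a "global" value); a non-negative
             0-based index counts only the pair starting there
             (a stand-in for a "local" value at one position).
    """
    pair_counts = Counter()
    first_counts = Counter()
    second_counts = Counter()

    for seq in helices:
        last_start = len(seq) - k  # exclusive upper bound for i
        if start is None:
            positions = range(max(last_start, 0))
        elif 0 <= start < last_start:
            positions = [start]
        else:
            positions = []
        for i in positions:
            a, b = seq[i], seq[i + k]
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
            second_counts[b] += 1

    total = sum(pair_counts.values())
    if total == 0:
        return {}

    return {
        (a, b): (n / total)
        / ((first_counts[a] / total) * (second_counts[b] / total))
        for (a, b), n in pair_counts.items()
    }
```

Under this assumed definition, a value above 1 means the pair occurs more often than independent residue frequencies would predict, and a value below 1 means it occurs less often. Pairs that never occur are absent from the result rather than reported as 0.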