Xia Xuhua, Xie Zheng
Bioinformatics Laboratory, HKU-Pasteur Research Center, Dexter H.C. Man Building, 8 Sassoon Road, Pokfulam, Hong Kong.
Mol Biol Evol. 2002 Jan;19(1):58-67. doi: 10.1093/oxfordjournals.molbev.a003982.
Amino acids interact with each other, especially with neighboring amino acids, to generate protein structures. We studied the pattern of association and repulsion of amino acids based on 24,748 protein-coding genes from human, 11,321 from mouse, and 15,028 from Escherichia coli, and documented the pattern of neighbor preference of amino acids. All amino acids have different preferences for neighbors. We also analyzed 7,342 proteins with known secondary structure and estimated the propensity of each of the 20 amino acids to occur in three major secondary structures: helices, sheets, and turns. Much of the neighbor preference can be explained by the propensity of the amino acids to form different secondary structures, but there are also a number of intriguing association and repulsion patterns. The similarity in neighbor preference among amino acids is significantly correlated with the number of amino acid substitutions in both mitochondrial and nuclear genes: amino acids with similar sets of neighbors replace each other more frequently than those with very different sets of neighbors. This similarity in neighbor preference is incorporated into a new index of amino acid dissimilarities that predicts nonsynonymous codon substitutions better than the two existing indices of amino acid dissimilarities, Grantham's and Miyata's distances.
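The idea of association and repulsion between neighboring amino acids can be illustrated with a minimal sketch. The abstract does not state which statistic the authors used, so the log-odds score below (observed adjacent-pair count against the count expected if neighbors were independent) is an assumption chosen for illustration, and the function name `neighbor_log_odds` is hypothetical.

```python
from collections import Counter
from math import log2

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def neighbor_log_odds(sequences):
    """Return {(a, b): log2(observed / expected)} for adjacent residue pairs.

    Assumption (not taken from the paper): the expected count of the pair
    (a, b) is computed from how often a appears as the left residue and b
    as the right residue, treating the two positions as independent.
    Positive scores suggest association, negative scores repulsion.
    Pairs that never occur are omitted from the result.
    """
    pair_counts = Counter()
    left_counts = Counter()
    right_counts = Counter()

    for seq in sequences:
        seq = seq.upper()
        # Walk over every pair of adjacent residues in the sequence.
        for a, b in zip(seq, seq[1:]):
            # Skip pairs containing ambiguous or non-standard symbols.
            if a in AMINO_ACIDS and b in AMINO_ACIDS:
                pair_counts[(a, b)] += 1
                left_counts[a] += 1
                right_counts[b] += 1

    total = sum(pair_counts.values())
    scores = {}
    for (a, b), observed in pair_counts.items():
        expected = left_counts[a] * right_counts[b] / total
        scores[(a, b)] = log2(observed / expected)
    return scores


if __name__ == "__main__":
    # Toy protein sequences, for illustration only.
    toy = ["MKVLAAGLLK", "MKKLLPAAGV"]
    for pair, score in sorted(neighbor_log_odds(toy).items()):
        print(pair, round(score, 3))
```

On real data the input would be the translated protein-coding sequences of a genome. A table of such scores gives each amino acid a profile of preferred and avoided neighbors, and profiles of different amino acids could then be compared to measure similarity in neighbor preference.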