Ghadimi Mahin, Heshmati Emran, Khalifeh Khosrow
Department of Biology, Faculty of Science, University of Zanjan, University Blvd, Zanjan, Islamic Republic of Iran.
Eur Biophys J. 2018 Jan;47(1):31-38. doi: 10.1007/s00249-017-1226-6. Epub 2017 Jun 13.
Identifying regularities in protein sequences and determining how they correlate with structural features is of great interest for understanding molecular biology. We statistically analyzed the relative frequencies of all 400 possible dipeptides in a data set of randomly selected proteins from four defined structural classes: all-alpha, all-beta, alpha + beta and alpha/beta. We found that the distribution of dipeptides differs between structural classes, and that the frequencies of some dipeptides deviate significantly from those expected under a random distribution. We also found that a given amino acid tends to occupy the first or the second position of a dipeptide depending on the structural class of the protein. Interestingly, some amino acids may be substituted for each other in the first or second position of specific dipeptides in each structural class. This finding appears to contrast with what would be expected from the classically understood properties of amino acids.
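The kind of dipeptide frequency analysis described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the statistical procedure, so everything here is an assumption: the function names are hypothetical, dipeptides are counted over overlapping windows, and "random distribution" is modeled as the product of the two single-residue frequencies. It is not the authors' method.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids (one-letter codes); 20 x 20 = 400 dipeptides.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_frequencies(sequences):
    """Relative frequency of each of the 400 dipeptides.

    Assumption: dipeptides are counted over overlapping windows, and pairs
    containing a non-standard residue (e.g. 'X') are skipped.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for first, second in zip(seq, seq[1:]):
            if first in AMINO_ACIDS and second in AMINO_ACIDS:
                counts[first + second] += 1
    total = sum(counts.values())
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


def residue_frequencies(sequences):
    """Relative frequency of each single amino acid in the data set."""
    counts = Counter(
        res for seq in sequences for res in seq.upper() if res in AMINO_ACIDS
    )
    total = sum(counts.values())
    return {a: (counts[a] / total if total else 0.0) for a in AMINO_ACIDS}


def observed_to_expected(sequences):
    """Ratio of observed dipeptide frequency to an independence baseline.

    Assumption: the random expectation for dipeptide 'ab' is f(a) * f(b).
    Ratios far from 1 suggest a non-random pairing preference.
    """
    observed = dipeptide_frequencies(sequences)
    single = residue_frequencies(sequences)
    ratios = {}
    for pair, freq in observed.items():
        expected = single[pair[0]] * single[pair[1]]
        ratios[pair] = freq / expected if expected > 0 else float("nan")
    return ratios
```

Running `observed_to_expected` separately on the sequences of each structural class and comparing the resulting ratios is one way to look for class-specific dipeptide preferences. Because the first and second residues are tracked separately, the same tables can also be used to compare, for example, `"AL"` with `"LA"`.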