Aarhus University, Department of Genetics and Biotechnology, Slagelse, Denmark.
Comput Biol Chem. 2011 Apr;35(2):57-61. doi: 10.1016/j.compbiolchem.2011.01.002. Epub 2011 Feb 2.
N-glycosylation is a common protein modification that affects a number of protein properties. Little is known about the distribution of N-glycosylation sequons, for example the distance between glycosylated sites and their position in the protein primary sequence. Using a large set of experimentally confirmed eukaryotic N-glycoproteins, we analyzed the relative position and distribution of sequons. N-glycosylation probability was lower at the termini of protein sequences than in the mid region. N-glycosylated sequons were found much farther from the C-terminus than from the N-terminus of the protein sequence, and this effect was more pronounced for NXS sequons. The distribution of sequons, modeled with the classical balls-in-boxes occupancy problem, showed a near-maximum probability. A considerable proportion of sequons were found within ten amino acids of each other, indicating that steric hindrance is not a key factor in protein N-glycosylation. Interestingly, the distribution of all sequons present in N-glycoproteins showed a pattern very similar to that of glycosylated sequons. The results indicate that protein N-glycosylation chiefly follows a random design.
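The quantities discussed in the abstract (sequon position relative to the termini, distance between neighbouring sequons, and a balls-in-boxes occupancy probability) can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the conventional sequon definition N-X-S/T with X not proline, which the abstract does not spell out, and it uses the textbook classical occupancy formula, which may differ from the exact model in the paper. All function names and the example sequence are hypothetical.

```python
import re
from math import comb

# Assumed sequon definition: Asn, any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons (e.g. "NNSS") all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(seq):
    """Return 0-based indices of the Asn residue of every sequon in seq."""
    return [m.start() for m in SEQUON.finditer(seq.upper())]


def relative_positions(seq):
    """Sequon positions as a fraction of sequence length (0 = N-terminus)."""
    n = len(seq)
    return [p / n for p in sequon_positions(seq)]


def neighbor_distances(seq):
    """Distances, in residues, between consecutive sequons."""
    pos = sequon_positions(seq)
    return [b - a for a, b in zip(pos, pos[1:])]


def occupancy_probability(balls, boxes, empty):
    """Classical occupancy: probability that exactly `empty` boxes stay
    empty when `balls` balls fall uniformly at random into `boxes` boxes."""
    filled = boxes - empty
    # Ways to place the balls so that all `filled` boxes are hit
    # (inclusion-exclusion over boxes that are missed).
    onto = sum((-1) ** j * comb(filled, j) * (filled - j) ** balls
               for j in range(filled + 1))
    return comb(boxes, empty) * onto / boxes ** balls


if __name__ == "__main__":
    example = "MNASGNPTANKTLNNSS"  # made-up sequence, for illustration only
    print(sequon_positions(example))
    print(neighbor_distances(example))
    print(occupancy_probability(balls=2, boxes=2, empty=0))
```

Here the "boxes" could stand for equal-length segments of a protein sequence and the "balls" for sequons; that mapping is an assumption made for the sketch, since the abstract does not describe how the model was set up.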