Aharonovsky E, Trifonov E N
Genome Diversity Center, Institute of Evolution, University of Haifa, Haifa 31905, Israel.
J Biomol Struct Dyn. 2005 Dec;23(3):237-42. doi: 10.1080/07391102.2005.10507062.
Conserved protein sequence segments are commonly believed to correspond to functional sites in the protein sequence. A novel approach is proposed for profiling how the degree of conservation changes along a protein sequence: the occurrence frequencies of all short oligopeptides of the given sequence are evaluated in a large proteome database. A sequence conservation profile can thus be plotted for every protein. The profile indicates where along the sequence the potential functional (conserved) sites are located. The oligopeptides belonging to these sites occur very frequently across many prokaryotic species. Analysis of a representative set of such profiles reveals a feature common to all examined proteins: they consist of sequence modules, each represented by a peak of conservation. The typical size of the modules (peak-to-peak distance) is 25-30 amino acid residues.
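The profiling idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the oligopeptide length, the scoring or normalization, or how peaks are identified, so the window length k, the raw-count score, the simple local-maximum rule, and the toy sequences below are all assumptions made for illustration.

```python
from collections import Counter


def count_oligopeptides(database, k):
    """Count every overlapping k-residue oligopeptide across all database sequences."""
    counts = Counter()
    for seq in database:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def conservation_profile(query, counts, k):
    """One score per window start: the database count of the oligopeptide at that position.

    Using the raw count as the score is an assumption; the abstract only says
    that occurrence frequencies are evaluated.
    """
    return [counts[query[i:i + k]] for i in range(len(query) - k + 1)]


def local_peaks(profile):
    """Positions higher than the left neighbour and at least as high as the right one.

    A hypothetical peak rule, used here only to show how peak-to-peak
    distances could be read off a profile.
    """
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i - 1] < profile[i] >= profile[i + 1]
    ]


# Toy data (invented for illustration, not taken from any real proteome).
database = ["MKTAYIAKQR", "GGTAYIAKLL", "PPTAYIAWWW"]
query = "AATAYIAKQC"
k = 3  # assumed oligopeptide length

counts = count_oligopeptides(database, k)
profile = conservation_profile(query, counts, k)
peaks = local_peaks(profile)

print(profile)  # [0, 0, 3, 3, 3, 2, 1, 0]
print(peaks)    # [2]
```

In this toy case the shared segment TAYIA yields the plateau of high scores, and the flanking residues, which are absent from the database, score zero. With a real proteome database, the distances between successive entries of `peaks` would give the peak-to-peak spacing that the abstract refers to.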