Karlin S, Burge C
Department of Mathematics, Stanford University, CA 94305-2125, USA.
Proc Natl Acad Sci U S A. 1996 Feb 20;93(4):1560-5. doi: 10.1073/pnas.93.4.1560.
Several human neurological disorders are associated with proteins containing abnormally long runs of glutamine residues. Strikingly, most of these proteins contain two or more additional long runs of amino acids other than glutamine. We screened the current human, mouse, Drosophila, yeast, and Escherichia coli protein sequence databases and identified all proteins containing multiple long homopeptides. This search found multiple long homopeptides in about 12% of Drosophila proteins but in only about 1.7% of human, mouse, and yeast proteins and none among E. coli proteins. Most of these sequences show other unusual sequence features, including multiple charge clusters and excessive counts of homopeptides of length ≥ 2 amino acid residues. Intriguingly, a large majority of the identified Drosophila proteins are essential developmental proteins and, in particular, most play a role in central nervous system development. Almost half of the human and mouse proteins identified are homeotic homologs. The role of long homopeptides in fine-tuning protein conformation for multiple functional activities is discussed. The relative contributions of strand slippage and of dynamic mutation are also addressed. Several new experiments are proposed.
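The screen described in the abstract (finding proteins that contain multiple long homopeptides) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the abstract does not state the length cutoff for a "long" homopeptide or how runs were counted, so `min_len=5` and `min_runs=2` are assumed defaults, and the function names and the toy sequences are hypothetical.

```python
import re
from typing import Dict, List, Tuple


def find_homopeptides(seq: str, min_len: int = 5) -> List[Tuple[str, int, int]]:
    """Return (residue, start_index, run_length) for every run of a single
    residue that is at least min_len long. min_len=5 is an assumed cutoff,
    not a value taken from the abstract."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # One letter followed by at least (min_len - 1) copies of the same letter.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [
        (m.group(1), m.start(), len(m.group(0)))
        for m in pattern.finditer(seq.upper())
    ]


def has_multiple_long_homopeptides(
    seq: str, min_len: int = 5, min_runs: int = 2
) -> bool:
    """True if the sequence has at least min_runs long homopeptide runs
    (min_runs=2 is an assumed reading of "multiple")."""
    return len(find_homopeptides(seq, min_len)) >= min_runs


def fraction_with_multiple_runs(
    proteins: Dict[str, str], min_len: int = 5, min_runs: int = 2
) -> float:
    """Fraction of sequences in a name -> sequence mapping that pass the screen."""
    if not proteins:
        return 0.0
    hits = sum(
        has_multiple_long_homopeptides(s, min_len, min_runs)
        for s in proteins.values()
    )
    return hits / len(proteins)


# Toy sequences invented for illustration only; they are not real proteins.
toy_proteins = {
    "toy_two_runs": "MAQQQQQQQQLSAAAAAAGHHK",  # one Q run and one A run
    "toy_one_run": "MKQQQQQQLSTAGHK",          # a single Q run
    "toy_no_runs": "MKLSTAGHKDE",              # no long run
}

runs = find_homopeptides(toy_proteins["toy_two_runs"])
fraction = fraction_with_multiple_runs(toy_proteins)
```

A real screen would read sequences from a database export (for example a FASTA file) instead of a hand-written dictionary, and the per-organism percentages reported in the abstract depend on the authors' own cutoffs and database versions, which this sketch does not reproduce.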