Science for Life Laboratory, Stockholm University, Solna, Sweden.
Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden.
PLoS Comput Biol. 2019 Jul 22;15(7):e1007186. doi: 10.1371/journal.pcbi.1007186. eCollection 2019 Jul.
Intrinsic disorder is more abundant in eukaryotic than in prokaryotic proteins. Methods that predict intrinsic disorder are based on the amino acid sequence of a protein. Therefore, an underlying difference between eukaryotic and prokaryotic sequences must cause the (predicted) difference in intrinsic disorder. By comparing proteins from complete eukaryotic and prokaryotic proteomes, we show that the difference in intrinsic disorder emerges from the linker regions connecting Pfam domains. Eukaryotic proteins have longer linker regions, and the eukaryotic linkers are also significantly more disordered: 38% vs. 12-16% disordered residues. Next, we examined the underlying reason for the increased disorder in eukaryotic linkers and found that changes in the abundance of only three amino acids cause the increase. Eukaryotic proteins contain 8.6% serine, 5.4% proline and 5.3% isoleucine, compared with 6.5% serine, 4.0% proline and ≈ 7.5% isoleucine in prokaryotic proteins. All three differences contribute to the increased disorder in eukaryotic proteins. It is tempting to speculate that the increase in serine frequency in eukaryotes is related to regulation by kinases, but direct evidence for this is lacking. The differences are observed in all phyla, protein families, structural regions and protein types, but are most pronounced in disordered and linker regions. The observation that differences in the abundance of three amino acids cause the difference in disorder between eukaryotic and prokaryotic proteins raises the question: are amino acid frequencies different in eukaryotic linkers because the linkers are more disordered, or do the differences cause the increased disorder?
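The amino acid percentages quoted above are simple composition fractions: the count of one residue divided by the total number of residues. A minimal sketch of that calculation follows. The function name `residue_fractions` and the toy sequence are hypothetical illustrations, not the authors' pipeline or data.

```python
from collections import Counter


def residue_fractions(sequence, residues="SPI"):
    """Return the fraction of each requested residue in a protein sequence.

    `residues` defaults to serine (S), proline (P) and isoleucine (I),
    the three amino acids discussed in the abstract.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    # Counter returns 0 for residues that do not occur in the sequence.
    return {aa: counts[aa] / len(seq) for aa in residues}


# Hypothetical toy sequence, used only to show the call; not a real linker.
toy_linker = "SSPGSPSEKSPTSAPSGI"
fractions = residue_fractions(toy_linker)
for aa, frac in fractions.items():
    print(f"{aa}: {frac:.1%}")
```

Applying the same function to every linker region in a proteome and pooling the counts would yield proteome-level percentages comparable in kind to those reported, assuming linker boundaries have already been derived from Pfam domain annotations.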