Oldfield Christopher J, Cheng Yugong, Cortese Marc S, Brown Celeste J, Uversky Vladimir N, Dunker A Keith
Molecular Kinetics, Inc., 6201 La Pas Trail, Suite 160, Indianapolis, Indiana 46268, USA.
Biochemistry. 2005 Feb 15;44(6):1989-2000. doi: 10.1021/bi047993o.
Intrinsically disordered proteins and regions carry out varied and vital cellular functions. Proteins with disordered regions are especially common in eukaryotic cells, with a subset of these proteins being mostly disordered, i.e., having more disordered than ordered residues. Two distinct methods have been previously described for using amino acid sequences to predict which proteins are likely to be mostly disordered. These methods are based on the net charge-hydropathy distribution and the disorder prediction score distribution. Each of these methods is reexamined, and the prediction results are compared herein. A new prediction method based on consensus is described. Application of the consensus method to whole genomes reveals that approximately 4.5% of Yersinia pestis, 5% of Escherichia coli K12, 6% of Archaeoglobus fulgidus, 8% of Methanobacterium thermoautotrophicum, 23% of Arabidopsis thaliana, and 28% of Mus musculus proteins are mostly disordered. The unexpectedly high frequency of mostly disordered proteins in eukaryotes has important implications both for large-scale, high-throughput projects and for focused experiments aimed at determination of protein structure and function.
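The abstract names the net charge-hydropathy method without giving its formulas, so the following is only a minimal sketch of how such a classifier is commonly set up, not the procedure used in this paper. It assumes the Kyte-Doolittle hydropathy scale rescaled to the range 0 to 1, a mean net charge that counts only K, R, D and E, and the linear boundary reported in the earlier charge-hydropathy literature (mean net charge = 2.785 x mean hydropathy - 1.151). The paper reexamines the method and may use different parameters. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq):
    """Mean Kyte-Doolittle hydropathy, rescaled so each residue lies in [0, 1]."""
    return sum((KD[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(seq):
    """Absolute net charge per residue, counting K/R as +1 and D/E as -1.

    Assumption: histidine is treated as neutral.
    """
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return abs(positive - negative) / len(seq)


def ch_distance(seq, slope=2.785, intercept=-1.151):
    """Signed vertical distance from the charge-hydropathy boundary.

    Assumption: the default slope and intercept come from the earlier
    charge-hydropathy literature, not from this abstract.
    Positive values fall on the disordered side of the boundary.
    """
    boundary_charge = slope * mean_hydropathy(seq) + intercept
    return mean_net_charge(seq) - boundary_charge


def is_mostly_disordered_ch(seq):
    """Whole-protein call: True if the sequence lies above the boundary."""
    return ch_distance(seq) > 0


if __name__ == "__main__":
    for name, seq in [("charged", "EEEEKKKKEE"), ("hydrophobic", "IIIILLLLVV")]:
        print(name, round(ch_distance(seq), 3), is_mostly_disordered_ch(seq))
```

A consensus predictor of the kind the abstract describes would combine a call like this with one derived from per-residue disorder scores. How the two are combined is not stated in the abstract, so it is left out of the sketch.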