Institute of Molecular Life Sciences, University of Zurich, Switzerland.
Proteomics. 2010 Mar;10(6):1297-306. doi: 10.1002/pmic.200900414.
Genome-wide, absolute quantification of expressed proteins is not yet within reach for most eukaryotes. However, large numbers of MS-based protein identifications have been deposited in databases, together with information on the observation frequencies of each peptide spectrum ("spectral counts"). We have conducted a meta-analysis using several million peptide observations from five model eukaryotes, establishing a consistent, semi-quantitative analysis pipeline. By inferring and comparing protein abundances across orthologs, we observe: (i) the accuracy of spectral counting predictions increases with sampling depth and can rival that of direct biochemical measurements, (ii) the quantitative makeup of the consistently observed core proteome in eukaryotes is remarkably stable, with abundance correlations exceeding R(S)=0.7 at an evolutionary distance greater than 1000 million years, and (iii) some groups of proteins are more constrained than others. We argue that our observations reveal stabilizing selection: central parts of the eukaryotic proteome appear to be expressed at well-balanced, near-optimal abundance levels. This is consistent with our further observations that essential proteins show lower abundance variations than non-essential proteins, and that gene families that tend to undergo gene duplications are less well constrained than families that keep a single-copy status.
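The comparison described in the abstract (abundance estimates derived from spectral counts, then rank correlation across orthologs) can be illustrated with a minimal sketch. This is not the authors' pipeline: the length normalisation, the parts-per-million scaling, the function names, and the toy protein identifiers and counts below are all assumptions made for illustration. Only the use of a Spearman rank correlation, written R(S) in the abstract, is taken from the source.

```python
from math import sqrt


def relative_abundance(spectral_counts, lengths):
    """Length-normalised spectral counts, scaled to parts per million.

    Assumption: dividing by protein length is one common way to correct
    for longer proteins yielding more peptides; the paper may weight
    counts differently.
    """
    rates = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(rates.values())
    return {p: 1e6 * r / total for p, r in rates.items()}


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def ortholog_correlation(abund_a, abund_b, orthologs):
    """Correlate abundances over ortholog pairs observed in both species."""
    pairs = [(abund_a[a], abund_b[b]) for a, b in orthologs
             if a in abund_a and b in abund_b]
    xs, ys = zip(*pairs)
    return spearman(list(xs), list(ys))


# Toy data: identifiers, counts, and lengths are invented for illustration.
counts_a = {"A1": 900, "A2": 120, "A3": 30, "A4": 400}
length_a = {"A1": 300, "A2": 400, "A3": 500, "A4": 250}
counts_b = {"B1": 700, "B2": 60, "B3": 90, "B4": 350}
length_b = {"B1": 310, "B2": 390, "B3": 520, "B4": 240}
orthologs = [("A1", "B1"), ("A2", "B2"), ("A3", "B3"), ("A4", "B4")]

abund_a = relative_abundance(counts_a, length_a)
abund_b = relative_abundance(counts_b, length_b)
print(ortholog_correlation(abund_a, abund_b, orthologs))
```

Using ranks rather than raw abundances makes the comparison insensitive to the differing dynamic ranges of the data sets, which matters when sampling depth varies between species.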