Centro de Biología Molecular Severo Ochoa, CSIC-UAM, 28049 Madrid, Spain.
J Proteome Res. 2014 Mar 7;13(3):1234-47. doi: 10.1021/pr4006958. Epub 2014 Feb 10.
The combination of stable isotope labeling (SIL) with mass spectrometry (MS) allows comparison of the abundance of thousands of proteins in complex mixtures. However, interpretation of the large data sets generated by these techniques remains a challenge because appropriate statistical standards are lacking. Here, we present a generally applicable model that accurately explains the behavior of data obtained using current SIL approaches, including ¹⁸O, iTRAQ, and SILAC labeling, and different MS instruments. The model decomposes the total technical variance into the spectral, peptide, and protein variance components, and its general validity was demonstrated by testing 48 experimental distributions against 18 different null hypotheses. In addition to its general applicability, the performance of the algorithm was at least comparable to that of other existing methods. The model also provides a general framework to integrate quantitative and error information fully, allowing a comparative analysis of the results obtained from different SIL experiments. The model was applied to the global analysis of protein alterations induced by low H₂O₂ concentrations in yeast, demonstrating the increased statistical power that may be achieved by rigorous data integration. Our results highlight the importance of establishing an adequate and validated statistical framework for the analysis of high-throughput data.
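To make the idea of an additive variance decomposition concrete, the following minimal Python sketch simulates log-ratios with nested protein, peptide, and spectrum deviations and checks that the total variance is close to the sum of the three components. It is an illustration only and not the algorithm described in the paper: the function name `simulate_log_ratios`, the variance values, the fixed numbers of peptides and spectra, and the assumption of independent Gaussian deviations with constant variance at each level are all hypothetical choices made for this example.

```python
import random
import statistics


def simulate_log_ratios(n_proteins, peptides_per_protein, spectra_per_peptide,
                        var_protein, var_peptide, var_spectrum, seed=0):
    """Simulate log-ratios under a nested, additive error model (assumed here).

    Each spectrum-level value is the sum of a protein-level, a peptide-level,
    and a spectrum-level deviation, each drawn independently from a zero-mean
    Gaussian. No true abundance change is simulated (null hypothesis).
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_proteins):
        protein_dev = rng.gauss(0.0, var_protein ** 0.5)
        for _ in range(peptides_per_protein):
            peptide_dev = rng.gauss(0.0, var_peptide ** 0.5)
            for _ in range(spectra_per_peptide):
                spectrum_dev = rng.gauss(0.0, var_spectrum ** 0.5)
                ratios.append(protein_dev + peptide_dev + spectrum_dev)
    return ratios


# Hypothetical variance components, chosen only for illustration.
components = {"var_protein": 0.02, "var_peptide": 0.03, "var_spectrum": 0.05}

ratios = simulate_log_ratios(2000, 3, 2, **components)

# For independent deviations, the total variance is the sum of the components.
expected_total = sum(components.values())
observed_total = statistics.pvariance(ratios)

print(f"expected total variance: {expected_total:.3f}")
print(f"observed total variance: {observed_total:.3f}")
```

With independent deviations at each level, the observed variance of the simulated spectrum-level log-ratios should approach the sum of the three components as the number of proteins grows. A real analysis would additionally need to estimate each component from the data, which this sketch does not attempt.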