Pacific Northwest National Laboratory, USA.
Proteomics. 2011 Dec;11(24):4736-41. doi: 10.1002/pmic.201100078. Epub 2011 Nov 17.
Quantified LC-MS peak intensities assigned during peptide identification in a typical comparative proteomics experiment vary from run to run of the instrument because of both technical and biological variation. Normalization of peak intensities across an LC-MS proteomics dataset is therefore a fundamental pre-processing step. However, the choice of normalization method can dramatically affect the downstream analysis of LC-MS proteomics data. Current normalization procedures for LC-MS proteomics data are presented in the context of normalization values derived from subsets of the full collection of identified peptides. The distribution of these normalization values is unknown a priori. If they are not independent of the biological factors associated with the experiment, the normalization process can introduce bias into the data, possibly affecting downstream statistical biomarker discovery. We present a novel approach to evaluating normalization strategies that includes the peptide selection component associated with the derivation of normalization values. Our approach evaluates the effect of normalization on the between-group variance structure in order to identify the most appropriate normalization methods, namely those that improve the structure of the data without introducing bias into the normalized peak intensities.
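The concern about normalization values depending on biological factors can be illustrated with a small sketch. It is not the procedure from the paper: it assumes log2-scale intensities, a per-run normalization value defined as the median over a chosen peptide subset, and a one-way ANOVA F statistic as a simple stand-in for checking whether those values differ between experimental groups. All function names, peptide labels, and numbers below are hypothetical.

```python
from statistics import mean, median


def normalization_values(runs, peptide_subset):
    """Per-run normalization value: median log2 intensity over the chosen peptides."""
    return [median(run[p] for p in peptide_subset if p in run) for run in runs]


def one_way_f(values, groups):
    """One-way ANOVA F statistic of `values` across the labels in `groups`."""
    by_group = {}
    for value, label in zip(values, groups):
        by_group.setdefault(label, []).append(value)
    grand_mean = mean(values)
    k, n = len(by_group), len(values)
    ss_between = sum(len(vs) * (mean(vs) - grand_mean) ** 2 for vs in by_group.values())
    ss_within = sum((v - mean(vs)) ** 2 for vs in by_group.values() for v in vs)
    if ss_within == 0:
        # No within-group spread: the ratio is undefined or unbounded.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def normalize(runs, norm_values):
    """Subtract each run's normalization value from its log2 intensities."""
    return [{p: x - nv for p, x in run.items()} for run, nv in zip(runs, norm_values)]


# Toy data (assumed): peptide "C" responds to treatment, "A" and "B" do not.
runs = [
    {"A": 10.0, "B": 12.0, "C": 14.0},
    {"A": 10.5, "B": 12.5, "C": 14.5},
    {"A": 10.0, "B": 12.0, "C": 16.0},
    {"A": 10.5, "B": 12.5, "C": 16.5},
]
groups = ["control", "control", "treated", "treated"]

f_stable = one_way_f(normalization_values(runs, {"A", "B"}), groups)
f_biased = one_way_f(normalization_values(runs, {"C"}), groups)
print(f_stable, f_biased)
```

In this toy case the subset {A, B} yields normalization values with no group effect, whereas normalizing on C alone yields values that track the treatment, so subtracting them would remove the biological signal. A real analysis would use a proper test with a p-value and many more peptides and runs.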