Choi Meena, Eren-Dogu Zeynep F, Colangelo Christopher, Cottrell John, Hoopmann Michael R, Kapp Eugene A, Kim Sangtae, Lam Henry, Neubert Thomas A, Palmblad Magnus, Phinney Brett S, Weintraub Susan T, MacLean Brendan, Vitek Olga
Northeastern University, Boston, Massachusetts 02115, United States.
Mugla Sitki Kocman University, 48000 Mugla, Turkey.
J Proteome Res. 2017 Feb 3;16(2):945-957. doi: 10.1021/acs.jproteome.6b00881. Epub 2017 Jan 3.
Detection of differentially abundant proteins in label-free quantitative shotgun liquid chromatography-tandem mass spectrometry (LC-MS/MS) experiments requires a series of computational steps that identify and quantify LC-MS features. It also requires statistical analyses that distinguish systematic changes in abundance between conditions from artifacts of biological and technical variation. The 2015 study of the Proteome Informatics Research Group (iPRG) of the Association of Biomolecular Resource Facilities (ABRF) aimed to evaluate the effects of the statistical analysis on the accuracy of the results. The study used LC-tandem mass spectra acquired from a controlled mixture, and made the data available to anonymous volunteer participants. The participants used methods of their choice to detect differentially abundant proteins, estimate the associated fold changes, and characterize the uncertainty of the results. The study found that multiple strategies (including the use of spectral counts versus peak intensities, and various software tools) could lead to accurate results, and that the performance was primarily determined by the analysts' expertise. This manuscript summarizes the outcome of the study, and provides representative examples of good computational and statistical practice. The data set generated as part of this study is publicly available.
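As a rough illustration of the kind of analysis participants were asked to perform (detect a change, estimate the fold change, and characterize its uncertainty), the following minimal sketch compares one protein between two conditions on the log2 scale with a Welch-type statistic. It is not the method used in the study or by any participant; the function name `welch_log2_comparison` and the intensity values are hypothetical, and a real analysis would also involve normalization, handling of missing values, and correction for multiple testing across proteins.

```python
import math
from statistics import mean, variance


def welch_log2_comparison(intensities_a, intensities_b):
    """Compare one protein between two conditions on the log2 scale.

    Hypothetical helper for illustration only. Inputs are positive
    intensities from replicate runs (at least two per condition).
    """
    log_a = [math.log2(x) for x in intensities_a]
    log_b = [math.log2(x) for x in intensities_b]
    n_a, n_b = len(log_a), len(log_b)

    # Estimated log2 fold change of condition B relative to condition A.
    log2_fc = mean(log_b) - mean(log_a)

    # Squared standard error contributed by each condition (unequal variances).
    se2_a = variance(log_a) / n_a
    se2_b = variance(log_b) / n_b
    se = math.sqrt(se2_a + se2_b)
    if se == 0:
        raise ValueError("zero variance in both conditions; statistic undefined")

    # Welch t statistic and Welch-Satterthwaite degrees of freedom.
    t_stat = log2_fc / se
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return {"log2_fc": log2_fc, "se": se, "t_stat": t_stat, "df": df}


# Made-up intensities for a single protein, three replicates per condition.
result = welch_log2_comparison([100, 200, 400], [400, 800, 1600])
print(result)
```

The standard error and degrees of freedom returned here are what a t distribution would need to turn the statistic into a p-value or a confidence interval for the fold change; that step is left out so the sketch depends only on the standard library.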