Baggerly Keith A, Morris Jeffrey S, Coombes Kevin R
Department of Biostatistics, U.T. M.D. Anderson Cancer Center, 1515 Holcombe Blvd, Box 447, Houston, TX 77030-4009, USA.
Bioinformatics. 2004 Mar 22;20(5):777-85. doi: 10.1093/bioinformatics/btg484. Epub 2004 Jan 29.
There has been much interest in using patterns derived from surface-enhanced laser desorption and ionization (SELDI) protein mass spectra from serum to differentiate samples from patients with disease from those without. Such patterns have been used without identifying the underlying proteins responsible. However, there are questions about the stability of this procedure across multiple experiments.
We compared SELDI proteomic spectra from serum from three experiments by the same group on separating ovarian cancer from normal tissue. These spectra are available on the web at http://clinicalproteomics.steem.com. In general, the results were not reproducible across experiments. Baseline correction prevents reproduction of the results for two of the experiments. In one experiment, there is evidence of a major shift in protocol mid-experiment, which could bias the results. In another, structure in the noise regions of the spectra allows us to distinguish normal from cancer, suggesting that the normals and cancers were processed differently. Sets of features found to discriminate well in one experiment do not generalize to other experiments. Finally, the mass calibration in all three experiments appears suspect. Taken together, these and other concerns suggest that much of the structure uncovered in these experiments could be due to artifacts of sample processing, not to the underlying biology of cancer. We provide some guidelines for design and analysis in experiments like these to ensure more reproducible, biologically meaningful results.
The MATLAB and Perl code used in our analyses is available at http://bioinformatics.mdanderson.org
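As a rough illustration of why baseline correction can make a reported difference disappear, the sketch below applies a crude sliding-window-minimum baseline to two synthetic spectra that share the same peak but sit on different additive offsets. This is not the authors' MATLAB or Perl code: the function name `subtract_baseline`, the window size, and the data are all assumptions chosen for illustration.

```python
def subtract_baseline(intensities, half_window=2):
    """Subtract a sliding-window minimum from each point.

    Hypothetical helper: a crude baseline estimate, not the method
    used in the paper.
    """
    n = len(intensities)
    corrected = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        baseline = min(intensities[lo:hi])  # local baseline estimate
        corrected.append(intensities[i] - baseline)
    return corrected


# Synthetic data (assumed): identical peak shape, different additive offsets.
peak = [0, 0, 1, 5, 1, 0, 0]
spectrum_a = [x + 10 for x in peak]
spectrum_b = [x + 20 for x in peak]

# Difference at the peak position before and after correction.
raw_gap = spectrum_b[3] - spectrum_a[3]
corrected_a = subtract_baseline(spectrum_a)
corrected_b = subtract_baseline(spectrum_b)
corrected_gap = corrected_b[3] - corrected_a[3]

print("raw gap:", raw_gap)
print("corrected gap:", corrected_gap)
```

In this toy case the two spectra differ only by their offset, so a feature that separates them in the raw intensities carries no information once the baseline is removed.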