Classification-based comparison of pre-processing methods for interpretation of mass spectrometry generated clinical datasets.

Authors

Wegdam Wouter, Moerland Perry D, Buist Marrije R, Ver Loren van Themaat Emiel, Bleijlevens Boris, Hoefsloot Huub CJ, de Koster Chris G, Aerts Johannes MFG

Affiliations

Department of Gynaecologic Oncology, Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands.

Bioinformatics Laboratory, Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands.

Publication information

Proteome Sci. 2009 May 14;7:19. doi: 10.1186/1477-5956-7-19.

DOI:10.1186/1477-5956-7-19
PMID:19442271
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2689848/
Abstract

BACKGROUND

Mass spectrometry is increasingly being used to discover proteins or protein profiles associated with disease. Experimental design of mass-spectrometry studies has come under close scrutiny and the importance of strict protocols for sample collection is now understood. However, the question of how best to process the large quantities of data generated is still unanswered. Main challenges for the analysis are the choice of proper pre-processing and classification methods. While these two issues have been investigated in isolation, we propose to use the classification of patient samples as a clinically relevant benchmark for the evaluation of pre-processing methods.

RESULTS

Two in-house generated clinical SELDI-TOF MS datasets are used in this study as an example of high-throughput mass-spectrometry data. We perform a systematic comparison of two commonly used pre-processing methods as implemented in Ciphergen ProteinChip Software and in the Cromwell package. With respect to reproducibility, Ciphergen and Cromwell pre-processing are largely comparable. We find that the overlap between peaks detected by either Ciphergen ProteinChip Software or Cromwell is large. This is especially the case for the more stringent peak detection settings. Moreover, similarity of the estimated intensities between matched peaks is high.

We evaluate the pre-processing methods using five different classification methods. Classification is done in a double cross-validation protocol using repeated random sampling to obtain an unbiased estimate of classification accuracy. No pre-processing method significantly outperforms the other for all peak detection settings evaluated.
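The double cross-validation protocol with repeated random sampling mentioned above can be illustrated with a minimal sketch. The abstract does not name the five classifiers, the split sizes, or the number of repeats, so everything below is an assumption made for illustration: a k-nearest-neighbour classifier stands in for the classifiers, `k_grid`, `n_outer`, `n_inner` and `test_fraction` are hypothetical parameters, and the splits are not stratified. The point is the structure: the tuning parameter is chosen using only the training part of each outer split, so the outer test samples never influence model selection.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training samples closest to x."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_idx, test_idx, X, y, k):
    """Fraction of test samples classified correctly by k-NN."""
    train_X = [X[i] for i in train_idx]
    train_y = [y[i] for i in train_idx]
    hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
    return hits / len(test_idx)


def random_split(indices, test_fraction, rng):
    """Randomly split indices into (train, held-out) lists."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def double_cross_validation(X, y, k_grid=(1, 3, 5), n_outer=20, n_inner=10,
                            test_fraction=0.3, seed=0):
    """Mean outer-loop accuracy; k is tuned on inner splits of the training data only."""
    rng = random.Random(seed)
    outer_scores = []
    for _ in range(n_outer):
        # Outer loop: hold out a random test set that is never used for tuning.
        train_idx, test_idx = random_split(range(len(X)), test_fraction, rng)
        # Inner loop: repeated random splits of the outer training set.
        inner_splits = [random_split(train_idx, test_fraction, rng)
                        for _ in range(n_inner)]

        def inner_score(k):
            return sum(accuracy(tr, va, X, y, k) for tr, va in inner_splits) / n_inner

        best_k = max(k_grid, key=inner_score)
        outer_scores.append(accuracy(train_idx, test_idx, X, y, best_k))
    return sum(outer_scores) / n_outer


# Synthetic two-class data (hypothetical, not the clinical datasets).
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(20)]
X += [[data_rng.gauss(10, 1), data_rng.gauss(10, 1)] for _ in range(20)]
y = [0] * 20 + [1] * 20
estimated_accuracy = double_cross_validation(X, y)
print(f"estimated accuracy: {estimated_accuracy:.2f}")
```

Tuning `k` on the same samples used to report accuracy would give an optimistically biased estimate; keeping the two loops separate is what makes the outer estimate usable as a benchmark for comparing pre-processing methods.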

CONCLUSION

We use classification of patient samples as a clinically relevant benchmark for the evaluation of pre-processing methods. Both pre-processing methods lead to similar classification results on an ovarian cancer and a Gaucher disease dataset. However, the settings for pre-processing parameters lead to large differences in classification accuracy and are therefore of crucial importance. We advocate the evaluation over a range of parameter settings when comparing pre-processing methods. Our analysis also demonstrates that reliable classification results can be obtained with a combination of strict sample handling and a well-defined classification protocol on clinical samples.
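One way to picture a comparison over a range of parameter settings is to compute the overlap between the peak lists of two pre-processing methods at several peak detection thresholds. The sketch below is only an assumption-laden illustration: the peak lists, the signal-to-noise thresholds, the 0.3% relative m/z tolerance, and the greedy matching rule are all hypothetical and are not taken from the Ciphergen or Cromwell software or from the paper.

```python
def match_peaks(peaks_a, peaks_b, rel_tol=0.003):
    """Greedily pair m/z values from two sorted lists within a relative tolerance."""
    matches = []
    i = j = 0
    while i < len(peaks_a) and j < len(peaks_b):
        a, b = peaks_a[i], peaks_b[j]
        if abs(a - b) <= rel_tol * a:
            matches.append((a, b))
            i += 1
            j += 1
        elif a < b:
            i += 1
        else:
            j += 1
    return matches


def overlap_fraction(peaks_a, peaks_b, rel_tol=0.003):
    """Matched peaks divided by the size of the smaller peak list."""
    if not peaks_a or not peaks_b:
        return 0.0
    n_matched = len(match_peaks(sorted(peaks_a), sorted(peaks_b), rel_tol))
    return n_matched / min(len(peaks_a), len(peaks_b))


def overlap_by_threshold(method_a, method_b, thresholds):
    """Overlap at each signal-to-noise threshold; inputs are (m/z, S/N) pairs."""
    result = {}
    for t in thresholds:
        kept_a = [mz for mz, snr in method_a if snr >= t]
        kept_b = [mz for mz, snr in method_b if snr >= t]
        result[t] = overlap_fraction(kept_a, kept_b)
    return result


# Hypothetical (m/z, signal-to-noise) peak lists from two methods.
method_a = [(2010.0, 12.0), (3500.0, 3.1), (4200.0, 8.0), (5900.0, 2.2), (7770.0, 15.0)]
method_b = [(2012.0, 10.0), (4205.0, 6.5), (6400.0, 2.5), (7765.0, 14.0), (9100.0, 3.0)]

overlaps = overlap_by_threshold(method_a, method_b, thresholds=(2.0, 5.0, 10.0))
for threshold, fraction in overlaps.items():
    print(f"S/N >= {threshold}: overlap {fraction:.2f}")
```

In this toy data the low-intensity peaks are the ones the two methods disagree on, so the overlap rises as the threshold becomes more stringent. A single threshold would hide that dependence, which is why the comparison is run over several settings.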

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf0/2689848/1d8084a756ae/1477-5956-7-19-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf0/2689848/c50f0e4b4b0e/1477-5956-7-19-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf0/2689848/0b923aba2a89/1477-5956-7-19-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf0/2689848/ab8a2f4599a1/1477-5956-7-19-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf0/2689848/d22c18ebdd90/1477-5956-7-19-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf0/2689848/44473bd447f3/1477-5956-7-19-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf0/2689848/b0f816ac8219/1477-5956-7-19-7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf0/2689848/e746643d3805/1477-5956-7-19-8.jpg
