A data review and re-assessment of ovarian cancer serum proteomic profiling.

Authors

Sorace James M, Zhan Min

Affiliation

Department of Pathology and Laboratory Services, Veterans Administration Maryland Health Care System, Baltimore 21201, USA.

Publication

BMC Bioinformatics. 2003 Jun 9;4:24. doi: 10.1186/1471-2105-4-24.

DOI:10.1186/1471-2105-4-24

PMID:12795817

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC165662/

Abstract

BACKGROUND

The early detection of ovarian cancer has the potential to dramatically reduce mortality. Recently, the use of mass spectrometry to develop profiles of patient serum proteins, combined with advanced data mining algorithms, has been reported as a promising method to achieve this goal. In this report, we analyze the Ovarian Dataset 8-7-02, downloaded from the Clinical Proteomics Program Databank website, using nonparametric statistics and stepwise discriminant analysis to develop rules to diagnose patients and to understand general patterns in the data that may guide future research.

RESULTS

The mass spectrometry serum profiles derived from cancer and control subjects exhibited numerous statistical differences. For example, using the Wilcoxon test to compare the intensity at each of the 15,154 mass-to-charge (M/Z) values between cancer and controls detected 3,591 M/Z values whose intensities differed with a p-value of 10^-6 or less. The region containing the M/Z values of greatest statistical difference between cancer and controls occurred at M/Z values less than 500. For example, the M/Z values of 2.7921478 and 245.53704 could be used to significantly separate the cancer from control groups. Three other sets of M/Z values, developed using a training set, could distinguish between cancer and control subjects in a test set with 100% sensitivity and specificity.
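The per-M/Z screening described above can be sketched roughly as follows. This is only an illustration, not the authors' code. It runs a two-sided Wilcoxon rank-sum test (normal approximation with tie correction) on each feature of a small synthetic dataset and counts the features with a p-value of 10^-6 or less. The group sizes, feature count, shift size, and the helper names `rank_sum_p_value` and `make_spectra` are all hypothetical, and the real dataset is not used.

```python
import math
import random


def rank_sum_p_value(group_a, group_b):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation,
    tie-corrected variance, no continuity correction)."""
    n_a, n_b = len(group_a), len(group_b)
    n = n_a + n_b
    # Pool both groups, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        t = j - i                     # size of this block of tied values
        tie_term += t ** 3 - t
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    mean = n_a * (n + 1) / 2.0
    var = n_a * n_b / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return 1.0  # every value is tied, so there is no evidence of a difference
    z = (rank_sum_a - mean) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical synthetic data: the first N_SHIFTED features differ between groups.
random.seed(0)
N_CANCER, N_CONTROL = 40, 40
N_FEATURES, N_SHIFTED = 200, 20
SHIFT = 3.0
THRESHOLD = 1e-6


def make_spectra(n_subjects, shift):
    """Return n_subjects rows of N_FEATURES simulated intensities."""
    return [
        [random.gauss(shift if f < N_SHIFTED else 0.0, 1.0) for f in range(N_FEATURES)]
        for _ in range(n_subjects)
    ]


cancer = make_spectra(N_CANCER, SHIFT)
control = make_spectra(N_CONTROL, 0.0)

# One test per feature, mirroring one test per M/Z value.
p_values = [
    rank_sum_p_value([row[f] for row in cancer], [row[f] for row in control])
    for f in range(N_FEATURES)
]
flagged = [f for f, p in enumerate(p_values) if p <= THRESHOLD]
print(f"{len(flagged)} of {N_FEATURES} features have p <= {THRESHOLD:g}")
```

A strict threshold matters when this many tests are run. If all 15,154 tests were independent and no true differences existed, a cutoff of 10^-6 would be expected to yield about 0.015 false positives, so 3,591 hits cannot plausibly be attributed to chance alone.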

CONCLUSION

The ability to discriminate between cancer and control subjects based on the M/Z values of 2.7921478 and 245.53704 reveals the existence of a significant non-biologic experimental bias between these two groups. This bias may invalidate attempts to use this dataset to find patterns of reproducible diagnostic value. To minimize false discovery, results using mass spectrometry and data mining algorithms should be carefully reviewed and benchmarked with routine statistical methods.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea00/165662/92d4d2e6bd62/1471-2105-4-24-1.jpg
