Data mining techniques for cancer detection using serum proteomic profiling.

Author information

Li Lihua, Tang Hong, Wu Zuobao, Gong Jianli, Gruidl Michael, Zou Jun, Tockman Melvyn, Clark Robert A

Affiliation

Department of Radiology, College of Medicine, H. Lee Moffitt Cancer Center and Research Institute, University of South Florida, Tampa, FL 33612-4799, USA.

Publication information

Artif Intell Med. 2004 Oct;32(2):71-83. doi: 10.1016/j.artmed.2004.03.006.

DOI:10.1016/j.artmed.2004.03.006

PMID:15364092

Abstract

OBJECTIVE

Pathological changes in an organ or tissue may be reflected in proteomic patterns in serum. Unique serum proteomic patterns could therefore be used to discriminate cancer samples from non-cancer ones. Because proteomic profiles are complex, a higher-order analysis such as data mining is needed to uncover the differences between them. The objectives of this paper are (1) to briefly review the application of data mining techniques in proteomics for cancer detection/diagnosis; (2) to explore a novel analytic method with different feature selection methods; and (3) to compare the results obtained on different data sets with those reported by Petricoin et al. in terms of detection performance and selected proteomic patterns.

METHODS AND MATERIAL

Three serum SELDI MS (surface-enhanced laser desorption/ionization mass spectrometry) data sets were used in this research to identify serum proteomic patterns that distinguish the serum of ovarian cancer cases from that of non-cancer controls. A support vector machine-based method was applied, with statistical testing and a genetic algorithm-based method used separately for feature selection. Leave-one-out cross-validation with receiver operating characteristic (ROC) curve analysis was used to evaluate and compare cancer detection performance.
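
The evaluation loop described above (feature selection by statistical testing, a classifier, leave-one-out cross-validation, and ROC analysis) can be sketched roughly as follows. This is only an illustration under several assumptions that do not come from the abstract: the data are synthetic, the statistical test is taken to be a Welch t-statistic, a nearest-centroid classifier stands in for the support vector machine to keep the example dependency-free, and feature selection is repeated inside each fold. All function names are hypothetical.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t-statistic between two lists of intensities."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def select_features(X, y, k):
    """Rank features (e.g. m/z bins) by |t| and return the indices of the top k."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append((welch_t(pos, neg), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def centroid_score(X, y, feats, x):
    """Stand-in classifier (NOT an SVM): higher score means closer to the cancer centroid."""
    def centroid(label):
        rows = [r for r, l in zip(X, y) if l == label]
        return [mean(r[j] for r in rows) for j in feats]

    def dist(c):
        return math.sqrt(sum((x[j] - cj) ** 2 for j, cj in zip(feats, c)))

    return dist(centroid(0)) - dist(centroid(1))


def loocv_scores(X, y, k):
    """Leave-one-out CV: hold out each sample, select features on the rest, score it."""
    scores = []
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        feats = select_features(X_tr, y_tr, k)  # selection inside the fold (assumption)
        scores.append(centroid_score(X_tr, y_tr, feats, X[i]))
    return scores


def roc_auc(y, scores):
    """Area under the ROC curve as P(score_pos > score_neg); ties count one half."""
    pos = [s for s, l in zip(scores, y) if l == 1]
    neg = [s for s, l in zip(scores, y) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(n_per_class=20, n_features=50, informative=(3, 17), seed=0):
    """Synthetic stand-in for spectra: Gaussian noise, two shifted features in class 1."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for j in informative:
                    row[j] += 3.0
            X.append(row)
            y.append(label)
    return X, y


X, y = make_synthetic()
auc = roc_auc(y, loocv_scores(X, y, k=5))
print(f"LOOCV AUC with top-5 t-test features: {auc:.3f}")
```

Repeating the selection step inside every fold keeps the held-out sample from influencing which features are chosen; whether the original study did this is not stated in the abstract.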

RESULTS AND CONCLUSIONS

The results showed that (1) data mining techniques can be successfully applied to ovarian cancer detection with reasonably high performance; (2) classification using features selected by the genetic algorithm consistently outperformed classification using features selected by statistical testing in terms of accuracy and robustness; (3) the discriminatory features (proteomic patterns) can differ greatly from one selection method to another. In other words, pattern selection and its classification efficiency are highly classifier dependent. Therefore, when data mining techniques are used, the discrimination of cancer from normal does not depend solely upon the identity and origin of cancer-related proteins.
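
The genetic-algorithm feature selection referred to in result (2) can be pictured as a wrapper search over feature subsets, scored by how well a classifier performs on them. The sketch below is a generic illustration, not the authors' algorithm: the synthetic data, the nearest-centroid classifier standing in for the support vector machine, the fitness function (leave-one-out accuracy minus a small size penalty), and the GA settings (truncation selection, one-point crossover, bit-flip mutation, elitism) are all assumptions, and all names are hypothetical.

```python
import random


def make_synthetic(n_per_class=12, n_features=20, informative=(2, 7), seed=1):
    """Synthetic stand-in data: Gaussian noise, two shifted features in class 1."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for j in informative:
                    row[j] += 3.0
            X.append(row)
            y.append(label)
    return X, y


def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in for an SVM)."""
    if not feats:
        return 0.0
    correct = 0
    for i in range(len(X)):
        dist = {}
        for label in (0, 1):
            rows = [r for k, r in enumerate(X) if y[k] == label and k != i]
            centroid = [sum(r[j] for r in rows) / len(rows) for j in feats]
            dist[label] = sum((X[i][j] - c) ** 2 for j, c in zip(feats, centroid))
        predicted = 1 if dist[1] < dist[0] else 0
        correct += int(predicted == y[i])
    return correct / len(X)


def fitness(mask, X, y, penalty=0.01):
    """Assumed fitness: classifier accuracy minus a small cost per selected feature."""
    feats = [j for j, on in enumerate(mask) if on]
    return loo_accuracy(X, y, feats) - penalty * len(feats)


def ga_select(X, y, pop_size=20, generations=15, p_mut=0.05, seed=0):
    """Evolve boolean feature masks and return the indices kept by the best mask."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.random() < 0.2 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(m, X, y), reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0][:]]              # elitism: keep the best mask unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]  # mutation
            children.append(child)
        pop = children
    best = max(pop, key=lambda m: fitness(m, X, y))
    return [j for j, on in enumerate(best) if on]


X, y = make_synthetic()
selected = ga_select(X, y)
print("GA-selected feature indices:", selected)
print("LOOCV accuracy on selected features:", round(loo_accuracy(X, y, selected), 3))
```

Because the fitness depends on the classifier, a different classifier or a filter-style statistical test can settle on a different subset, which is one way to read result (3). In this sketch the fitness is computed on the full data set, so the reported accuracy is optimistic; an unbiased estimate would nest the search inside an outer cross-validation loop.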

Similar articles

Application of serum SELDI proteomic patterns in diagnosis of lung cancer.

BMC Cancer. 2005 Jul 20;5:83. doi: 10.1186/1471-2407-5-83.

Artificial neural networks and decision tree model analysis of liver cancer proteomes.

Biochem Biophys Res Commun. 2007 Sep 14;361(1):68-73. doi: 10.1016/j.bbrc.2007.06.172. Epub 2007 Jul 10.

A data review and re-assessment of ovarian cancer serum proteomic profiling.

BMC Bioinformatics. 2003 Jun 9;4:24. doi: 10.1186/1471-2105-4-24.

Proteomic studies of early-stage and advanced ovarian cancer patients.

Gynecol Oncol. 2008 Oct;111(1):111-9. doi: 10.1016/j.ygyno.2008.06.031. Epub 2008 Aug 15.

Neuroblastoma detection using serum proteomic profiling: a novel mining technique for cancer?

J Pediatr Surg. 2006 Apr;41(4):639-46; discussion 639-46. doi: 10.1016/j.jpedsurg.2005.12.037.

Using proteomic approaches to identify new biomarkers for detection and monitoring of ovarian cancer.

Gynecol Oncol. 2006 Feb;100(2):247-53. doi: 10.1016/j.ygyno.2005.08.051. Epub 2005 Oct 17.

Functional genomics and proteomics in the clinical neurosciences: data mining and bioinformatics.

Prog Brain Res. 2006;158:83-108. doi: 10.1016/S0079-6123(06)58004-5.

Proteomic tracking of serum protein isoforms as screening biomarkers of ovarian cancer.

Proteomics. 2005 Nov;5(17):4625-36. doi: 10.1002/pmic.200401321.

Proteomic profiling identifies afamin as a potential biomarker for ovarian cancer.

Clin Cancer Res. 2007 Dec 15;13(24):7370-9. doi: 10.1158/1078-0432.CCR-07-0747.

Cited by

AI-Derived Blood Biomarkers for Ovarian Cancer Diagnosis: Systematic Review and Meta-Analysis.

J Med Internet Res. 2025 Mar 24;27:e67922. doi: 10.2196/67922.

Integrative machine learning frameworks to uncover specific protein signature in neuroendocrine cervical carcinoma.

BMC Cancer. 2025 Jan 10;25(1):57. doi: 10.1186/s12885-025-13454-z.

AI-driven eyelid tumor classification in ocular oncology using proteomic data.

NPJ Precis Oncol. 2024 Dec 23;8(1):289. doi: 10.1038/s41698-024-00767-8.

Predicting mortality in brain stroke patients using neural networks: outcomes analysis in a longitudinal study.

Sci Rep. 2023 Oct 28;13(1):18530. doi: 10.1038/s41598-023-45877-8.

MSFC: a new feature construction method for accurate diagnosis of mass spectrometry data.

Sci Rep. 2023 Sep 21;13(1):15694. doi: 10.1038/s41598-023-42395-5.

Biomarker Discovery by Imperialist Competitive Algorithm in Mass Spectrometry Data for Ovarian Cancer Prediction.

J Med Signals Sens. 2021 May 24;11(2):108-119. doi: 10.4103/jmss.JMSS_20_20. eCollection 2021 Apr-Jun.

The outcome in patients with brain stroke: A deep learning neural network modeling.

J Res Med Sci. 2020 Aug 24;25:78. doi: 10.4103/jrms.JRMS_268_20. eCollection 2020.

Integrated Chemometrics and Statistics to Drive Successful Proteomics Biomarker Discovery.

Proteomes. 2018 Apr 26;6(2):20. doi: 10.3390/proteomes6020020.

Evolution of extrema features reveals optimal stimuli for biological state transitions.

Sci Rep. 2018 Feb 21;8(1):3403. doi: 10.1038/s41598-018-21761-8.

Influence of honeybee sting on peptidome profile in human serum.

Toxins (Basel). 2015 May 22;7(5):1808-20. doi: 10.3390/toxins7051808.
