Developing a discrimination rule between breast cancer patients and controls using proteomics mass spectrometric data: a three-step approach.

Authors

Heidema A Geert, Nagelkerke Nico

Affiliation

Maastricht University.

Publication details

Stat Appl Genet Mol Biol. 2008;7(2):Article5. doi: 10.2202/1544-6115.1341. Epub 2008 Feb 8.

DOI:10.2202/1544-6115.1341

PMID:18312219

Abstract

To discriminate between breast cancer patients and controls, we used a three-step approach to obtain our decision rule. First, we ranked the mass/charge values using random forests, because random forests generate importance indices that take possible interactions into account. We observed that the top-ranked variables consisted of highly correlated, contiguous mass/charge values, which we grouped into new variables in the second step. Finally, these newly created variables were used as predictors to find a suitable discrimination rule. In this last step, we compared three methods: classification and regression trees (CART), logistic regression and penalized logistic regression. Logistic regression and penalized logistic regression performed equally well, and both had a higher classification accuracy than CART. We chose the model obtained with penalized logistic regression because we hypothesized that it would provide better classification accuracy in the validation set. The resulting rule performed well on the training set, with a classification accuracy of 86.3%, a sensitivity of 86.8% and a specificity of 85.7%.
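The second step, grouping top-ranked contiguous mass/charge values into new variables, can be illustrated with a minimal sketch. The abstract does not say how the grouping was done, so everything specific here is an assumption: the Pearson correlation threshold `min_corr`, the choice of the per-sample mean as the summary of each group, the function names `group_contiguous` and `summarise`, and the toy intensities. The indices in `top_idx` stand in for the variables ranked highest by random forest importance in the first step.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def group_contiguous(top_idx, columns, min_corr=0.9):
    """Merge top-ranked column indices into runs of adjacent, highly correlated columns.

    Assumption: two columns join the same group only if their indices are
    consecutive and their correlation is at least `min_corr`.
    """
    groups = []
    for i in sorted(top_idx):
        if groups:
            last = groups[-1][-1]
            if i == last + 1 and pearson(columns[last], columns[i]) >= min_corr:
                groups[-1].append(i)
                continue
        groups.append([i])
    return groups


def summarise(groups, columns):
    """Build one new variable per group: the per-sample mean over the group's columns."""
    return [
        [mean(values) for values in zip(*(columns[i] for i in group))]
        for group in groups
    ]


# Hypothetical toy data: 5 mass/charge columns, 4 samples each.
columns = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],  # adjacent to column 0 and highly correlated with it
    [4.0, 1.0, 3.0, 2.0],
    [2.0, 2.0, 5.0, 1.0],
    [2.1, 1.9, 5.2, 1.1],  # adjacent to column 3 and highly correlated with it
]
top_idx = [4, 0, 1, 3]  # stand-in for the top-ranked variables from step one

groups = group_contiguous(top_idx, columns)
new_vars = summarise(groups, columns)
print(groups)
```

The new variables in `new_vars` would then serve as the predictors for the third step, where the abstract compares CART, logistic regression and penalized logistic regression.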

Similar articles

A classification model for the Leiden proteomics competition.

Stat Appl Genet Mol Biol. 2008;7(2):Article8. doi: 10.2202/1544-6115.1351. Epub 2008 Feb 19.

Classification of breast cancer versus normal samples from mass spectrometry profiles using linear discriminant analysis of important features selected by random forest.

Stat Appl Genet Mol Biol. 2008;7(2):Article7. doi: 10.2202/1544-6115.1345. Epub 2008 Feb 19.

Application of the random forest classification method to peaks detected from mass spectrometric proteomic profiles of cancer patients and controls.

Stat Appl Genet Mol Biol. 2008;7(2):Article4. doi: 10.2202/1544-6115.1349. Epub 2008 Feb 8.

Support vector machine approach to separate control and breast cancer serum samples.

Stat Appl Genet Mol Biol. 2008;7(2):Article11. doi: 10.2202/1544-6115.1355. Epub 2008 Feb 21.

Empirical Bayes logistic regression.

Stat Appl Genet Mol Biol. 2008;7(2):Article9. doi: 10.2202/1544-6115.1359. Epub 2008 Feb 21.

Combined experimental and statistical strategy for mass spectrometry based serum protein profiling for diagnosis of breast cancer: a case-control study.

J Proteome Res. 2008 Apr;7(4):1419-26. doi: 10.1021/pr7007576. Epub 2008 Feb 28.

Feature extraction and dimensionality reduction for mass spectrometry data.

Comput Biol Med. 2009 Sep;39(9):818-23. doi: 10.1016/j.compbiomed.2009.06.012. Epub 2009 Jul 30.

Discrimination analysis of mass spectrometry proteomics for ovarian cancer detection.

Acta Pharmacol Sin. 2008 Oct;29(10):1240-6. doi: 10.1111/j.1745-7254.2008.00861.x.

Identification of lung cancer patients by serum protein profiling using surface-enhanced laser desorption/ionization time-of-flight mass spectrometry.

Am J Clin Oncol. 2008 Apr;31(2):133-9. doi: 10.1097/COC.0b013e318145b98b.

Cited by

The role of mass spectrometry-based metabolomics in medical countermeasures against radiation.

Mass Spectrom Rev. 2010 May-Jun;29(3):503-21. doi: 10.1002/mas.20272.

Examining the significance of fingerprint-based classifiers.

BMC Bioinformatics. 2008 Dec 17;9:545. doi: 10.1186/1471-2105-9-545.
