Predicting survival from microarray data--a comparative study.

Author information

Bøvelstad H M, Nygård S, Størvold H L, Aldrin M, Borgan Ø, Frigessi A, Lingjaerde O C

Affiliation

Department of Mathematics, University of Oslo, Norway.

Publication information

Bioinformatics. 2007 Aug 15;23(16):2080-7. doi: 10.1093/bioinformatics/btm305. Epub 2007 Jun 6.

DOI:10.1093/bioinformatics/btm305

PMID:17553857

Abstract

MOTIVATION

Survival prediction from gene expression data and other high-dimensional genomic data has been the subject of much research in recent years. Such data pose the methodological problem of having far more gene expression values than individuals. In addition, the responses are censored survival times. Most of the proposed methods handle this by using Cox's proportional hazards model and obtain parameter estimates by some dimension reduction or parameter shrinkage estimation technique. Using three well-known microarray gene expression data sets, we compare the prediction performance of seven such methods: univariate selection, forward stepwise selection, principal components regression (PCR), supervised principal components regression, partial least squares regression (PLS), ridge regression and the lasso.
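
For orientation, the two shrinkage methods named above can be written as penalized versions of the Cox partial log-likelihood. The notation below is a standard textbook formulation and an assumption of this note, not taken from the paper: $x_i$ is the expression vector of individual $i$, $\delta_i$ indicates an observed (uncensored) event, $R(t_i)$ is the set of individuals still at risk at time $t_i$, and $\lambda \ge 0$ is a tuning parameter. The paper may scale the penalties differently.

```latex
% Cox proportional hazards model
h(t \mid x) = h_0(t)\,\exp\!\left(x^{\top}\beta\right)

% Partial log-likelihood (no tied event times)
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \left[ x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right) \right]

% Ridge regression: quadratic penalty, shrinks all coefficients toward zero
\hat{\beta}_{\text{ridge}} = \arg\max_{\beta}
  \left\{ \ell(\beta) - \lambda \sum_{k=1}^{p} \beta_k^{2} \right\}

% Lasso: absolute-value penalty, sets some coefficients exactly to zero
\hat{\beta}_{\text{lasso}} = \arg\max_{\beta}
  \left\{ \ell(\beta) - \lambda \sum_{k=1}^{p} \lvert \beta_k \rvert \right\}
```

With many more genes than individuals ($p \gg n$), the unpenalized maximizer of $\ell(\beta)$ is not well defined, which is why some form of shrinkage or dimension reduction is needed.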

RESULTS

Statistical learning from subsets should be repeated several times in order to get a fair comparison between methods. Methods using coefficient shrinkage or linear combinations of the gene expression values have much better performance than the simple variable selection methods. For our data sets, ridge regression has the overall best performance.
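
The point about repeating the learning over several subsets can be illustrated with a small sketch. Everything in it is an assumption made for illustration rather than the authors' procedure: the data are simulated, the function names (`fit_ridge_cox`, `concordance_index`, `repeated_split_cindex`) are hypothetical, the ridge-penalized Cox model is fitted by plain gradient ascent with a fixed penalty, and performance is scored with Harrell's concordance index, which is not necessarily the criterion used in the paper.

```python
import numpy as np


def fit_ridge_cox(X, time, event, lam=1.0, lr=0.02, n_iter=300):
    """Ridge-penalized Cox regression by gradient ascent (assumes no tied times)."""
    order = np.argsort(-time)  # latest time first, so cumulative sums run over risk sets
    X, event = X[order], event[order]
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        eta = X @ beta
        w = np.exp(eta - eta.max())  # the common factor cancels in the ratio below
        cum_w = np.cumsum(w)
        cum_wx = np.cumsum(w[:, None] * X, axis=0)
        # Gradient of the partial log-likelihood: x_i minus the risk-set weighted mean
        grad = (event[:, None] * (X - cum_wx / cum_w[:, None])).sum(axis=0)
        beta += lr * (grad / n - 2.0 * lam * beta)
    return beta


def concordance_index(time, event, risk):
    """Harrell's C: share of comparable pairs in which higher risk fails earlier."""
    comparable = (time[:, None] < time[None, :]) & event[:, None]
    concordant = (risk[:, None] > risk[None, :]) & comparable
    tied = (risk[:, None] == risk[None, :]) & comparable
    return (concordant.sum() + 0.5 * tied.sum()) / comparable.sum()


def repeated_split_cindex(X, time, event, n_repeats=10, train_frac=0.7, seed=0):
    """Fit on a random training subset, score on the rest, and repeat."""
    rng = np.random.default_rng(seed)
    n = len(time)
    n_train = int(round(train_frac * n))
    scores = []
    for _ in range(n_repeats):
        perm = rng.permutation(n)
        train, test = perm[:n_train], perm[n_train:]
        beta = fit_ridge_cox(X[train], time[train], event[train])
        scores.append(concordance_index(time[test], event[test], X[test] @ beta))
    return np.array(scores)


# Simulated data (an assumption): 120 individuals, 200 "genes", 5 of them prognostic.
rng = np.random.default_rng(0)
n, p = 120, 200
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:5] = 1.0
event_time = rng.exponential(1.0 / np.exp(X @ true_beta))
censor_time = rng.exponential(2.0, size=n)
time = np.minimum(event_time, censor_time)
event = event_time <= censor_time  # True where the event was observed

scores = repeated_split_cindex(X, time, event)
print(f"C-index over {len(scores)} splits: "
      f"mean={scores.mean():.3f}, sd={scores.std(ddof=1):.3f}")
```

The spread of the scores across splits shows why a single training/test division can favor one method by chance; comparing methods on the same set of repeated splits reduces that risk.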

AVAILABILITY

Matlab and R code for the prediction methods are available at http://www.med.uio.no/imb/stat/bmms/software/microsurv/.
