Jiang Wenyu, Simon Richard
Biometric Research Branch, Division of Cancer Treatment and Diagnosis, National Cancer Institute, National Institutes of Health, 6130 Executive Boulevard, Rockville, MD 20852, USA.
Stat Med. 2007 Dec 20;26(29):5320-34. doi: 10.1002/sim.2968.
This paper first provides a critical review of some existing methods for estimating the prediction error in classifying microarray data, where the number of genes greatly exceeds the number of specimens. Special attention is given to the bootstrap-related methods. When the sample size n is small, we find that all the reviewed methods suffer from either substantial bias or variability. We introduce a repeated leave-one-out bootstrap (RLOOB) method that predicts for each specimen in the sample using bootstrap learning sets of size ln, a multiple l of the sample size n. We then propose an adjusted bootstrap (ABS) method that fits a learning curve to the RLOOB estimates calculated with different bootstrap learning set sizes. The ABS method is robust across the situations we investigate and provides a slightly conservative estimate of the prediction error. Even with small samples, it does not suffer from the large upward bias of the leave-one-out bootstrap and the 0.632+ bootstrap, nor from the large variability of leave-one-out cross-validation in microarray applications.
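The abstract describes the two estimators only at a high level, so the following is a minimal Python sketch of how they could be organized, not the authors' implementation. The classifier choice (3-nearest-neighbor), the number of bootstrap replicates, the set of size multipliers, and the inverse-power-law form of the learning curve are all illustrative assumptions; the function names rloob_error and abs_error are hypothetical.

```python
# Hypothetical sketch of the RLOOB estimator and the ABS learning-curve
# adjustment. Classifier, replicate counts, multipliers, and the curve
# form e(m) = a + b * m**(-c) are assumptions, not the paper's choices.
import numpy as np
from scipy.optimize import curve_fit
from sklearn.neighbors import KNeighborsClassifier

def rloob_error(X, y, learning_set_size, n_boot=50, seed=0):
    """Repeated leave-one-out bootstrap: for each specimen i, train on
    bootstrap learning sets of the given size drawn (with replacement)
    from the other n-1 specimens, and record how often specimen i is
    misclassified."""
    rng = np.random.default_rng(seed)
    n = len(y)
    errors = []
    for i in range(n):
        others = np.delete(np.arange(n), i)   # leave specimen i out
        wrong = 0
        for _ in range(n_boot):
            idx = rng.choice(others, size=learning_set_size, replace=True)
            clf = KNeighborsClassifier(n_neighbors=3).fit(X[idx], y[idx])
            wrong += clf.predict(X[i:i + 1])[0] != y[i]
        errors.append(wrong / n_boot)
    return float(np.mean(errors))

def abs_error(X, y, multipliers=(1.0, 1.5, 2.0, 2.5), n_boot=50):
    """Adjusted bootstrap: compute RLOOB estimates at several learning
    set sizes l*n, fit a learning curve through them, and read off the
    fitted error at the actual sample size n."""
    n = len(y)
    sizes = np.array([int(l * n) for l in multipliers])
    errs = np.array([rloob_error(X, y, m, n_boot=n_boot) for m in sizes])
    # Assumed inverse-power-law learning curve e(m) = a + b * m**(-c).
    f = lambda m, a, b, c: a + b * m ** (-c)
    (a, b, c), _ = curve_fit(f, sizes, errs,
                             p0=(errs.min(), 1.0, 0.5), maxfev=10000)
    return float(f(n, a, b, c))
```

Under these assumptions, abs_error(X, y) would return the learning-curve-adjusted estimate of the prediction error at sample size n; in practice the number of bootstrap replicates and the grid of size multipliers would be tuned to the application.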