Augmented backward elimination: a pragmatic and purposeful way to develop statistical models.

Authors

Dunkler Daniela, Plischke Max, Leffondré Karen, Heinze Georg

Affiliations

Medical University of Vienna, Center for Medical Statistics, Informatics and Intelligent Systems, Section for Clinical Biometrics, Vienna, Austria.

Medical University of Vienna, Division of Nephrology and Dialysis, Department of Internal Medicine III, Vienna, Austria.

Publication information

PLoS One. 2014 Nov 21;9(11):e113677. doi: 10.1371/journal.pone.0113677. eCollection 2014.

DOI: 10.1371/journal.pone.0113677

PMID: 25415265

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4240713/

Abstract

Statistical models are simple mathematical rules derived from empirical data describing the association between an outcome and several explanatory variables. In a typical modeling situation statistical analysis often involves a large number of potential explanatory variables and frequently only partial subject-matter knowledge is available. Therefore, selecting the most suitable variables for a model in an objective and practical manner is usually a non-trivial task. We briefly revisit the purposeful variable selection procedure suggested by Hosmer and Lemeshow which combines significance and change-in-estimate criteria for variable selection and critically discuss the change-in-estimate criterion. We show that using a significance-based threshold for the change-in-estimate criterion reduces to a simple significance-based selection of variables, as if the change-in-estimate criterion is not considered at all. Various extensions to the purposeful variable selection procedure are suggested. We propose to use backward elimination augmented with a standardized change-in-estimate criterion on the quantity of interest usually reported and interpreted in a model for variable selection. Augmented backward elimination has been implemented in a SAS macro for linear, logistic and Cox proportional hazards regression. The algorithm and its implementation were evaluated by means of a simulation study. Augmented backward elimination tends to select larger models than backward elimination and approximates the unselected model up to negligible differences in point estimates of the regression coefficients. On average, regression coefficients obtained after applying augmented backward elimination were less biased relative to the coefficients of correctly specified models than after backward elimination. 
In summary, we propose augmented backward elimination as a reproducible variable selection algorithm that gives the analyst more flexibility in adapting model selection to a specific statistical modeling situation.
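
The idea of augmenting backward elimination with a change-in-estimate check can be illustrated with a minimal Python sketch for linear regression. This is not the authors' SAS macro. The function names (ols, abe), the thresholds alpha = 0.2 and tau = 0.05, the normal approximation for p-values, and the standardization |change| * SD(x) / SD(y) applied to every remaining coefficient are all assumptions made for illustration only.

```python
import math
import numpy as np

def ols(X, y):
    """Least-squares fit with intercept; returns slopes and two-sided p-values.
    P-values use a normal approximation (an assumption, adequate for large n)."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    sigma2 = resid @ resid / (len(y) - Xd.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Xd.T @ Xd)))
    z = np.abs(beta / se)
    p = np.array([math.erfc(v / math.sqrt(2)) for v in z])
    return beta[1:], p[1:]  # drop the intercept

def abe(X, y, names, alpha=0.2, tau=0.05):
    """Hypothetical sketch of augmented backward elimination.
    A variable is dropped only if it is non-significant (p > alpha) AND
    dropping it changes no remaining standardized coefficient by more than tau."""
    active = list(range(X.shape[1]))
    sd_y = y.std(ddof=1)
    while True:
        beta, p = ols(X[:, active], y)
        # Removal candidates, least significant first.
        candidates = [int(i) for i in np.argsort(-p) if p[i] > alpha]
        removed = False
        for i in candidates:
            rest = active[:i] + active[i + 1:]
            beta_reduced, _ = ols(X[:, rest], y)
            sd_x = X[:, rest].std(axis=0, ddof=1)
            # Standardized change-in-estimate for each remaining variable.
            change = np.abs(beta_reduced - np.delete(beta, i)) * sd_x / sd_y
            if change.size == 0 or change.max() <= tau:
                active, removed = rest, True
                break
        if not removed:
            return [names[j] for j in active]

# Simulated data: x2 is correlated with x1 and has a weak effect; x3 is noise.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 * x1 + 0.05 * x2 + rng.normal(size=n)
names = ["x1", "x2", "x3"]

selected = abe(X, y, names)
print(selected)
```

Setting tau to a very large value makes the change-in-estimate check always pass, so the sketch then behaves like plain backward elimination at level alpha. Comparing the two runs on the same data shows which variables are retained only because removing them would shift other coefficients.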

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e95a/4240713/c77449b137f6/pone.0113677.g001.jpg

Similar articles

Bootstrap model selection had similar performance for selecting authentic and noise variables compared to backward variable elimination: a simulation study.

J Clin Epidemiol. 2008 Oct;61(10):1009-17.e1. doi: 10.1016/j.jclinepi.2007.11.014. Epub 2008 Jun 9.

Purposeful selection of variables in logistic regression.

Source Code Biol Med. 2008 Dec 16;3:17. doi: 10.1186/1751-0473-3-17.

Variable selection - A review and recommendations for the practicing statistician.

Biom J. 2018 May;60(3):431-449. doi: 10.1002/bimj.201700067. Epub 2018 Jan 2.

Coupled variable selection for regression modeling of complex treatment patterns in a clinical cancer registry.

Stat Med. 2014 Dec 30;33(30):5358-70. doi: 10.1002/sim.6340. Epub 2014 Oct 27.

Assessing the accuracy and stability of variable selection methods for random forest modeling in ecology.

Environ Monit Assess. 2017 Jul;189(7):316. doi: 10.1007/s10661-017-6025-0. Epub 2017 Jun 6.

Variable selection for clustering with Gaussian mixture models.

Biometrics. 2009 Sep;65(3):701-9. doi: 10.1111/j.1541-0420.2008.01160.x. Epub 2009 Feb 4.

Evaluating variable selection methods for multivariable regression models: A simulation study protocol.

PLoS One. 2024 Aug 9;19(8):e0308543. doi: 10.1371/journal.pone.0308543. eCollection 2024.

Performance of using multiple stepwise algorithms for variable selection.

Stat Med. 2010 Jul 10;29(15):1647-59. doi: 10.1002/sim.3943.

Impact of the 1990 Hong Kong legislation for restriction on sulfur content in fuel.

Res Rep Health Eff Inst. 2012 Aug(170):5-91.

Cited by

Resting-state functional network segregation of the default mode network predicts valence bias across the lifespan.

Imaging Neurosci (Camb). 2024 Dec 19;2. doi: 10.1162/imag_a_00403. eCollection 2024.

Deposition of Mesoporous Silicon Dioxide Films Using Microwave PECVD.

Materials (Basel). 2025 Jul 7;18(13):3205. doi: 10.3390/ma18133205.

The edge orientation entropy of natural scenes is associated with infant visual preferences and adult aesthetic judgements.

PLoS One. 2025 Feb 26;20(2):e0316555. doi: 10.1371/journal.pone.0316555. eCollection 2025.

Procedural Volume and Outcomes After Septal Reduction Therapies in Hypertrophic Obstructive Cardiomyopathy.

J Am Heart Assoc. 2024 Nov 5;13(21):e036387. doi: 10.1161/JAHA.124.036387. Epub 2024 Oct 25.

Adipokines and Myokines as Markers of Malnutrition and Sarcopenia in Patients Receiving Kidney Replacement Therapy: An Observational, Cross-Sectional Study.

Nutrients. 2024 Jul 31;16(15):2480. doi: 10.3390/nu16152480.

Evaluating variable selection methods for multivariable regression models: A simulation study protocol.

PLoS One. 2024 Aug 9;19(8):e0308543. doi: 10.1371/journal.pone.0308543. eCollection 2024.

Clinical Predictors of Cisplatin Chemoradiation-Induced Ototoxicity in HPV-Positive Oropharyngeal Squamous Cell Carcinoma: A Case-Control Study.

J Otolaryngol Head Neck Surg. 2024 Jan-Dec;53:19160216241248671. doi: 10.1177/19160216241248671.

Islet Isolation Outcomes in Patients Undergoing Total Pancreatectomy With Islet Autotransplantation in the POST Consortium.

Transplantation. 2025 Jan 1;109(1):207-216. doi: 10.1097/TP.0000000000005127. Epub 2024 Dec 7.

Determinants of public institutional births in India: An analysis using the National Family Health Survey (NFHS-5) factsheet data.

J Family Med Prim Care. 2024 Apr;13(4):1408-1420. doi: 10.4103/jfmpc.jfmpc_982_23. Epub 2024 Apr 22.

Ensemble machine learning for predicting in-hospital mortality in Asian women with ST-elevation myocardial infarction (STEMI).

Sci Rep. 2024 May 29;14(1):12378. doi: 10.1038/s41598-024-61151-x.

References

Competing risks analyses: objectives and approaches.

Eur Heart J. 2014 Nov 7;35(42):2936-41. doi: 10.1093/eurheartj/ehu131. Epub 2014 Apr 7.

Urine osmolarity and risk of dialysis initiation in a chronic kidney disease cohort--a possible titration target?

PLoS One. 2014 Mar 27;9(3):e93226. doi: 10.1371/journal.pone.0093226. eCollection 2014.

Is a cutoff of 10% appropriate for the change-in-estimate criterion of confounder identification?

J Epidemiol. 2014;24(2):161-7. doi: 10.2188/jea.je20130062. Epub 2013 Dec 7.

Combining directed acyclic graphs and the change-in-estimate procedure as a novel approach to adjustment-variable selection in epidemiology.

BMC Med Res Methodol. 2012 Oct 11;12:156. doi: 10.1186/1471-2288-12-156.

Theorems, proofs, examples, and rules in the practice of epidemiology.

Epidemiology. 2012 May;23(3):443-5. doi: 10.1097/EDE.0b013e31824e2d4e.

A new criterion for confounder selection.

Biometrics. 2011 Dec;67(4):1406-13. doi: 10.1111/j.1541-0420.2011.01619.x. Epub 2011 May 31.

An overview of the objectives of and the approaches to propensity score analyses.

Eur Heart J. 2011 Jul;32(14):1704-8. doi: 10.1093/eurheartj/ehr031. Epub 2011 Feb 28.

On model selection and model misspecification in causal inference.

Stat Methods Med Res. 2012 Feb;21(1):7-30. doi: 10.1177/0962280210387717. Epub 2010 Nov 12.

Understanding confounding and mediation.

Evid Based Ment Health. 2009 Aug;12(3):68-71. doi: 10.1136/ebmh.12.3.68.

Purposeful selection of variables in logistic regression.

Source Code Biol Med. 2008 Dec 16;3:17. doi: 10.1186/1751-0473-3-17.
