Consistent Estimation of Generalized Linear Models with High Dimensional Predictors via Stepwise Regression.

Authors

Pijyan Alex, Zheng Qi, Hong Hyokyoung G, Li Yi

Affiliations

Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA.

Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY 40202, USA.

Publication information

Entropy (Basel). 2020 Aug 31;22(9):965. doi: 10.3390/e22090965.

DOI:10.3390/e22090965

PMID:33286734

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7597260/

Abstract

Predictive models play a central role in decision making. Penalized regression approaches, such as the least absolute shrinkage and selection operator (LASSO), have been widely used to construct predictive models and explain the impacts of the selected predictors, but the estimates are typically biased. Moreover, when data are ultrahigh-dimensional, penalized regression is usable only after variable screening methods have been applied to reduce the number of variables. We propose a stepwise procedure for fitting generalized linear models with ultrahigh-dimensional predictors. Our procedure can provide a final model; control both false negatives and false positives; and yield consistent estimates, which are useful for gauging the actual effect sizes of risk factors. Simulations and applications to two clinical studies verify the utility of the method.

Similar articles

Estimation and Selection via Absolute Penalized Convex Minimization And Its Multistage Adaptive Applications.

J Mach Learn Res. 2012 Jun 1;13:1839-1864.

On the robustness of the adaptive lasso to model misspecification.

Biometrika. 2012 Sep;99(3):717-731. doi: 10.1093/biomet/ass027. Epub 2012 Jul 11.

On the impact of model selection on predictor identification and parameter inference.

Comput Stat. 2017;32(2):667-690. doi: 10.1007/s00180-016-0690-2. Epub 2016 Oct 22.

Optimism Bias Correction in Omics Studies with Big Data: Assessment of Penalized Methods on Simulated Data.

OMICS. 2019 Apr;23(4):207-213. doi: 10.1089/omi.2018.0191. Epub 2019 Feb 22.

Simulation-selection-extrapolation: Estimation in high-dimensional errors-in-variables models.

Biometrics. 2019 Dec;75(4):1133-1144. doi: 10.1111/biom.13112. Epub 2019 Aug 28.

Feature-specific inference for penalized regression using local false discovery rates.

Stat Med. 2023 Apr 30;42(9):1412-1429. doi: 10.1002/sim.9678. Epub 2023 Feb 3.

Building generalized linear models with ultrahigh dimensional features: A sequentially conditional approach.

Biometrics. 2020 Mar;76(1):47-60. doi: 10.1111/biom.13122. Epub 2019 Nov 6.

False discovery control for penalized variable selections with high-dimensional covariates.

Stat Appl Genet Mol Biol. 2018 Dec 15;17(6):/j/sagmb.2018.17.issue-6/sagmb-2018-0038/sagmb-2018-0038.xml. doi: 10.1515/sagmb-2018-0038.

Statistical Inference for High-Dimensional Models via Recursive Online-Score Estimation.

J Am Stat Assoc. 2021;116(535):1307-1318. doi: 10.1080/01621459.2019.1710154. Epub 2020 Jan 23.

Cited by

Integrating bioinformatics analysis, machine learning, and experimental validation to identify pyroptosis-related genes in the diagnosis of sepsis combined with acute liver failure.

Hereditas. 2025 Aug 8;162(1):153. doi: 10.1186/s41065-025-00522-4.

Quantile forward regression for high-dimensional survival data.

Lifetime Data Anal. 2023 Oct;29(4):769-806. doi: 10.1007/s10985-023-09603-w. Epub 2023 Jul 2.

High-Dimensional Survival Analysis: Methods and Applications.

Annu Rev Stat Appl. 2023 Mar;10(1):25-49. doi: 10.1146/annurev-statistics-032921-022127. Epub 2022 Oct 6.

Smart triage: Development of a rapid pediatric triage algorithm for use in low-and-middle income countries.

Front Pediatr. 2022 Nov 22;10:976870. doi: 10.3389/fped.2022.976870. eCollection 2022.
