Pijyan Alex, Zheng Qi, Hong Hyokyoung G, Li Yi
Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA.
Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY 40202, USA.
Entropy (Basel). 2020 Aug 31;22(9):965. doi: 10.3390/e22090965.
Predictive models play a central role in decision making. Penalized regression approaches, such as the least absolute shrinkage and selection operator (LASSO), have been widely used to construct predictive models and explain the impacts of the selected predictors, but the estimates are typically biased. Moreover, when data are ultrahigh-dimensional, penalized regression is usable only after variable screening methods have been applied to reduce the number of variables. We propose a stepwise procedure for fitting generalized linear models with ultrahigh-dimensional predictors. Our procedure can provide a final model; control both false negatives and false positives; and yield consistent estimates, which are useful for gauging the actual effect size of risk factors. Simulations and applications to two clinical studies verify the utility of the method.
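The remark that penalized estimates are typically biased can be illustrated with a small sketch. It is not the stepwise procedure proposed in the paper, and it rests on a simplifying assumption not made in the abstract: an orthonormal design, under which the LASSO solution for each coefficient is the soft-thresholded unpenalized estimate. The coefficient values and the penalty level `lam` are hypothetical.

```python
def soft_threshold(z, lam):
    """LASSO estimate of one coefficient under an orthonormal design.

    z is the unpenalized (least-squares) estimate and lam the penalty level.
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized estimates for four predictors.
ols = [3.0, -2.5, 0.4, 0.0]
lam = 1.0  # hypothetical penalty level

# LASSO shrinks every retained coefficient toward zero by lam.
lasso = [soft_threshold(z, lam) for z in ols]

# Predictors that survive the penalty.
selected = [j for j, b in enumerate(lasso) if b != 0.0]

# Refitting without a penalty on the selected predictors restores the
# unshrunken values (under orthonormality the refit equals the original
# least-squares estimate for each selected predictor).
refit = [ols[j] if j in selected else 0.0 for j in range(len(ols))]

print("lasso:", lasso)
print("refit:", refit)
```

In this toy setting the retained LASSO coefficients are each off by exactly `lam`, while an unpenalized fit on the selected predictors is not shrunk. This is one way to see why a selection procedure that ends in an unpenalized fit may be preferable when the goal is to gauge effect sizes.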