Shen Jianzhao, Gao Sujuan
Division of Biostatistics, Department of Medicine, Indiana University School of Medicine, 1050 Wishard Blvd. RG4101, Indianapolis, IN 46202-2872, USA.
J Data Sci. 2008 Oct 1;6(4):515-531.
In dementia screening, items can be selected to shorten an existing screening test using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models are often seriously biased, or fail to exist at all, because of separation and multicollinearity problems caused by a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models, which has been shown to reduce bias and to avoid the non-existence problem. Ridge regression has been used in logistic regression to stabilize the estimates in the presence of multicollinearity. However, neither method solves the problem that the other addresses. In this paper, we propose a double penalized maximum likelihood estimator that combines Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study.
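As a rough illustration of how a Firth-type penalty and a ridge penalty might be combined, the sketch below solves a modified score equation for logistic regression by Fisher scoring. It is a minimal sketch under stated assumptions, not the authors' implementation: the ridge term is assumed to enter the log-likelihood as -lam * ||beta||^2, all coefficients (including the intercept) are penalized, and the function name `double_penalized_logit`, the parameter `lam`, and the step cap `max_step` are hypothetical names chosen for this example.

```python
import numpy as np

def double_penalized_logit(X, y, lam=0.1, max_iter=200, tol=1e-8, max_step=5.0):
    """Logistic regression with a Firth-type penalty plus a ridge penalty.

    Assumed objective: l(beta) + 0.5 * log|X'WX| - lam * ||beta||^2.
    X is an (n, p) design matrix (include a column of ones for an intercept),
    y is a 0/1 response vector.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    beta = np.zeros(p)
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted probabilities
        w = pi * (1.0 - pi)                      # logistic weights
        info = X.T @ (w[:, None] * X)            # Fisher information X'WX
        # Diagonal of the hat matrix W^{1/2} X (X'WX)^{-1} X' W^{1/2}
        h = w * np.einsum("ij,ij->i", X @ np.linalg.inv(info), X)
        # Firth-modified score, then the gradient of the assumed ridge term
        score = X.T @ (y - pi + h * (0.5 - pi)) - 2.0 * lam * beta
        step = np.linalg.solve(info + 2.0 * lam * np.eye(p), score)
        # Cap the step size to keep the iteration stable
        largest = np.max(np.abs(step))
        if largest > max_step:
            step *= max_step / largest
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Completely separated toy data: the ordinary MLE does not exist here.
X_demo = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y_demo = np.array([0.0, 0.0, 1.0, 1.0])
beta_firth_only = double_penalized_logit(X_demo, y_demo, lam=0.0)
beta_double = double_penalized_logit(X_demo, y_demo, lam=1.0)
```

Setting `lam=0.0` reduces the sketch to a Firth-type estimator alone, so comparing the two fits on the toy data shows the extra shrinkage contributed by the ridge term. In practice the ridge parameter would need to be chosen by some criterion, which this sketch leaves open.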