Ni Ai, Cai Jianwen, Zeng Donglin
3101 McGavran-Greenberg Hall, CB 7420, Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A.
Biometrika. 2016 Sep;103(3):547-562. doi: 10.1093/biomet/asw027. Epub 2016 Aug 10.
Case-cohort designs are widely used in large cohort studies to reduce the cost associated with covariate measurement. In many such studies the number of covariates is very large, so an efficient variable selection method is necessary. In this paper, we study the properties of a variable selection procedure using the smoothly clipped absolute deviation penalty in a case-cohort design with a diverging number of parameters. We establish the consistency and asymptotic normality of the maximum penalized pseudo-partial-likelihood estimator, and show that the proposed variable selection method is consistent and has an asymptotic oracle property. Simulation studies compare the finite-sample performance of the procedure with tuning parameter selection methods based on the Akaike information criterion and the Bayesian information criterion. We make recommendations for use of the proposed procedures in case-cohort studies, and apply them to the Busselton Health Study.
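The abstract names the smoothly clipped absolute deviation (SCAD) penalty but does not state its formula. As a minimal sketch, assuming the standard form of the penalty, the functions below evaluate the penalty and its derivative for a single coefficient. The function names, the argument names `lam` and `a`, and the default `a = 3.7` are illustrative assumptions, not values taken from the paper.

```python
def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty for one coefficient (standard form; a=3.7 is an assumed default)."""
    if lam <= 0 or a <= 2:
        raise ValueError("requires lam > 0 and a > 2")
    theta = abs(beta)
    if theta <= lam:
        # Small coefficients: behaves like the lasso (linear in |beta|).
        return lam * theta
    if theta <= a * lam:
        # Intermediate coefficients: quadratic transition region.
        return (2 * a * lam * theta - theta ** 2 - lam ** 2) / (2 * (a - 1))
    # Large coefficients: constant penalty, so no further shrinkage is applied.
    return lam ** 2 * (a + 1) / 2


def scad_derivative(beta, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |beta|."""
    if lam <= 0 or a <= 2:
        raise ValueError("requires lam > 0 and a > 2")
    theta = abs(beta)
    if theta <= lam:
        return lam
    # Decreases linearly to zero at |beta| = a * lam, then stays at zero.
    return max(a * lam - theta, 0.0) / (a - 1)
```

The derivative vanishes once the absolute coefficient exceeds `a * lam`, so large coefficients are left essentially unpenalized while small ones are shrunk toward zero. This is the behaviour usually associated with oracle-type results of the kind the abstract describes.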