Saegusa Takumi, Ma Tianzhou, Li Gang, Chen Ying Qing, Lee Mei-Ling Ting
Department of Biostatistics, University of Maryland, College Park MD 20742.
Department of Epidemiology and Biostatistics, University of Maryland, College Park MD 20742.
Stat Biosci. 2020 Dec;12(3):376-398. doi: 10.1007/s12561-020-09284-1. Epub 2020 Jun 17.
The threshold regression model is an effective alternative to the Cox proportional hazards regression model when the proportional hazards assumption is not met. This paper considers variable selection for threshold regression. This model has separate regression functions for the initial health status and the speed of degradation in health. This flexibility is an important advantage when considering relevant risk factors for a complex time-to-event model where one needs to decide which variables should be included in the regression function for the initial health status, in the function for the speed of degradation in health, or in both functions. In this paper, we extend the broken adaptive ridge (BAR) method, originally designed for variable selection for one regression function, to simultaneous variable selection for both regression functions needed in the threshold regression model. We establish variable selection consistency of the proposed method and asymptotic normality of the estimator of non-zero regression coefficients. Simulation results show that our method outperformed threshold regression without variable selection and variable selection based on the Akaike information criterion. We apply the proposed method to data from an HIV drug adherence study in which electronic monitoring of drug intake is used to identify risk factors for non-adherence.
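The abstract does not spell out the BAR iteration. As a rough illustration only, the sketch below shows BAR in the single-regression-function setting it was originally designed for, using ordinary least squares: a ridge fit is reweighted repeatedly, with each coefficient's penalty scaled by the inverse square of its previous value, so that small coefficients collapse to zero. This is not the paper's two-function threshold regression procedure. The function name `bar_linear`, the tuning values `lam` and `xi`, and the cutoff `zero_tol` are assumptions made for this example.

```python
import numpy as np


def bar_linear(X, y, lam=1.0, xi=1.0, max_iter=100, tol=1e-8, zero_tol=1e-6):
    """Illustrative broken adaptive ridge (BAR) for least squares.

    Hypothetical helper, not the authors' implementation. Each step solves
        beta_new = argmin ||y - X b||^2 + lam * sum_j b_j^2 / beta_old_j^2
    starting from an ordinary ridge estimate with penalty xi.
    """
    p = X.shape[1]
    gram = X.T @ X
    xty = X.T @ y

    # Initial ridge estimate (all coefficients non-zero in general).
    beta = np.linalg.solve(gram + xi * np.eye(p), xty)

    for _ in range(max_iter):
        g = np.abs(beta)
        # Algebraically equal to (X'X + lam * diag(1 / beta^2))^{-1} X'y,
        # rewritten as G (G X'X G + lam I)^{-1} G X'y with G = diag(|beta|)
        # so that coefficients already at zero cause no division by zero.
        A = (g[:, None] * gram) * g[None, :] + lam * np.eye(p)
        new_beta = g * np.linalg.solve(A, g * xty)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break

    # Report numerically negligible coefficients as exact zeros.
    beta = beta.copy()
    beta[np.abs(beta) < zero_tol] = 0.0
    return beta
```

In this sketch, variables whose coefficients shrink to zero are treated as unselected. A threshold regression version would replace the least-squares objective with the model's log-likelihood and penalize the coefficients of both regression functions together.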