Department of Population Health, New York University School of Medicine, New York, New York.
Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania.
Stat Med. 2019 May 30;38(12):2184-2205. doi: 10.1002/sim.8100. Epub 2019 Jan 30.
We study regularized estimation in high-dimensional longitudinal classification problems, using the lasso and fused lasso regularizers. The constructed coefficient estimates are piecewise constant across the time dimension in the longitudinal problem, with adaptively selected change points (break points). We present an efficient algorithm for computing such estimates, based on proximal gradient descent. We apply our proposed technique to a longitudinal data set on Alzheimer's disease from the Cardiovascular Health Study Cognition Study. Using data analysis and a simulation study, we motivate and demonstrate several practical considerations such as the selection of tuning parameters and the assessment of model stability. While race, gender, vascular and heart disease, lack of caregivers, and deterioration of learning and memory are all important predictors of dementia, we also find that these risk factors become more relevant in the later stages of life.
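The abstract does not spell out the algorithm, so the following is only a minimal sketch of how proximal gradient descent might be applied to a logistic loss with lasso and fused lasso penalties, not the authors' implementation. It assumes one coefficient vector per time point (a T x p matrix `B`), a logistic loss averaged over all observations, and penalty weights `lam1` (lasso) and `lam2` (fused lasso across time); all function and variable names are hypothetical. The proximal step uses the known property that the prox of the combined penalty equals soft-thresholding applied to the prox of the fused (total variation) term. The total variation prox is approximated here by projected gradient on its dual, a simple substitute chosen for brevity rather than speed.

```python
import numpy as np

def soft_threshold(z, lam):
    """Prox of lam * ||x||_1 (elementwise shrinkage toward zero)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def prox_tv1d(z, lam, n_iter=300):
    """Approximate prox of lam * sum_t |x[t+1] - x[t]|.

    Runs projected gradient on the dual problem with step 1/4,
    which is safe because ||D D^T|| <= 4 for the difference operator D.
    """
    u = np.zeros(len(z) - 1)
    for _ in range(n_iter):
        x = z + np.diff(np.concatenate(([0.0], u, [0.0])))  # x = z - D^T u
        u = np.clip(u + 0.25 * np.diff(x), -lam, lam)
    return z + np.diff(np.concatenate(([0.0], u, [0.0])))

def objective(B, X, Y, lam1, lam2):
    """Mean logistic loss plus lasso and fused lasso penalties.

    B: (T, p) coefficients, X: (T, n, p) features, Y: (T, n) labels in {0, 1}.
    """
    eta = np.einsum("tnp,tp->tn", X, B)
    loss = np.sum(np.logaddexp(0.0, eta) - Y * eta) / Y.size
    return loss + lam1 * np.abs(B).sum() + lam2 * np.abs(np.diff(B, axis=0)).sum()

def fit_fused_lasso_logistic(X, Y, lam1, lam2, n_iter=200):
    """Proximal gradient descent with a fixed step size 1/L."""
    T, n, p = X.shape
    B = np.zeros((T, p))
    # The loss separates over time points, and the logistic Hessian weight is
    # at most 1/4, so L = max_t ||X_t||_2^2 / (4 N) bounds the curvature.
    L = max(np.linalg.norm(X[t], 2) ** 2 for t in range(T)) / (4.0 * Y.size)
    step = 1.0 / L
    for _ in range(n_iter):
        eta = np.einsum("tnp,tp->tn", X, B)
        prob = 0.5 * (1.0 + np.tanh(0.5 * eta))  # numerically stable sigmoid
        grad = np.einsum("tnp,tn->tp", X, prob - Y) / Y.size
        Z = B - step * grad
        # The penalty separates over features, so each coefficient path
        # across time is handled on its own: fuse first, then shrink.
        for j in range(p):
            B[:, j] = soft_threshold(prox_tv1d(Z[:, j], step * lam2), step * lam1)
    return B
```

With this penalty structure, larger `lam2` pushes each coefficient path toward fewer change points across time, and larger `lam1` pushes more coefficients to exactly zero, which matches the piecewise constant, sparse estimates described in the abstract.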