Fowler Charlotte, Cai Xiaoxuan, Baker Justin T, Onnela Jukka-Pekka, Valeri Linda
Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY, USA.
Department of Statistics, The Ohio State University, Columbus, OH, USA.
J R Stat Soc Ser C Appl Stat. 2024 Feb 29;73(3):755-773. doi: 10.1093/jrsssc/qlae010. eCollection 2024 Jun.
The use of digital devices to collect data in mobile health studies introduces a novel application of time-series methods, with the complication that data may be missing at random (MAR) or missing not at random (MNAR). In time-series analysis, testing for stationarity is an important preliminary step that informs the choice of subsequent analyses. The Dickey-Fuller test evaluates the null hypothesis of unit root non-stationarity, but assumes that no data are missing. Beyond recommendations to use complete case analysis or last-observation-carried-forward imputation when data are missing completely at random, researchers have not extended unit root non-stationarity testing to more complex missing data mechanisms. Multiple imputation with chained equations, Kalman smoothing imputation, and linear interpolation have also been used for time-series data; however, such methods impose constraints on the autocorrelation structure and affect unit root testing. We propose maximum likelihood estimation and multiple imputation using state space model approaches to adapt the augmented Dickey-Fuller test to a context with missing data. We further develop sensitivity analyses to examine the impact of MNAR data. We evaluate the performance of existing and proposed methods across missingness mechanisms in extensive simulations and in an application to a multi-year smartphone study of patients with bipolar disorder.
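As a rough illustration of why the handling of missing values matters for unit root testing, the sketch below computes a simple Dickey-Fuller t-statistic on one simulated series under three of the existing approaches named in the abstract (complete case analysis, last observation carried forward, and linear interpolation). It is not the state space method proposed in the paper. Everything in it is an assumption made for illustration: the AR(1) simulation, the 30% missing-completely-at-random mask, the helper names, and the use of the non-augmented test with an intercept rather than the augmented version.

```python
import math
import random


def dickey_fuller_stat(y):
    """t-statistic for gamma in dy_t = alpha + gamma * y_{t-1} + e_t.

    Non-augmented Dickey-Fuller regression with an intercept (illustrative only).
    """
    x = y[:-1]
    d = [b - a for a, b in zip(y[:-1], y[1:])]
    n = len(x)
    mx, md = sum(x) / n, sum(d) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxd = sum((xi - mx) * (di - md) for xi, di in zip(x, d))
    gamma = sxd / sxx
    alpha = md - gamma * mx
    rss = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, d))
    se = math.sqrt(rss / (n - 2) / sxx)
    return gamma / se


def simulate_ar1(n, phi, rng):
    """Hypothetical data generator: y_t = phi * y_{t-1} + N(0, 1) noise."""
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


def mask_mcar(y, p_missing, rng):
    """Set values to None completely at random; the first value is always kept."""
    return [v if i == 0 or rng.random() >= p_missing else None
            for i, v in enumerate(y)]


def locf(y):
    """Last observation carried forward."""
    out, last = [], None
    for v in y:
        if v is not None:
            last = v
        out.append(last)
    return out


def linear_interpolate(y):
    """Linear interpolation between observed neighbours; a trailing gap is carried forward."""
    out = list(y)
    known = [i for i, v in enumerate(y) if v is not None]
    for a, b in zip(known[:-1], known[1:]):
        for i in range(a + 1, b):
            w = (i - a) / (b - a)
            out[i] = (1 - w) * y[a] + w * y[b]
    for i in range(known[-1] + 1, len(y)):
        out[i] = y[known[-1]]
    return out


rng = random.Random(0)
y = simulate_ar1(500, 0.5, rng)      # stationary series (phi < 1), assumed for the demo
y_miss = mask_mcar(y, 0.3, rng)      # assumed 30% MCAR missingness

stats = {
    "full data": dickey_fuller_stat(y),
    # Complete case: drop missing values and treat the remainder as consecutive.
    "complete case": dickey_fuller_stat([v for v in y_miss if v is not None]),
    "LOCF": dickey_fuller_stat(locf(y_miss)),
    "linear interpolation": dickey_fuller_stat(linear_interpolate(y_miss)),
}
for name, s in stats.items():
    print(f"{name:>22}: {s:.2f}")
```

Each statistic would be compared against Dickey-Fuller critical values (about -2.86 at the 5% level for the intercept-only case), not against the usual t distribution. The statistics differ across approaches because each one changes the apparent autocorrelation of the series in its own way, which is the concern the abstract raises about such methods.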