Centre for Epidemiology Versus Arthritis, Manchester Academic Health Science Centre, University of Manchester, Manchester, United Kingdom.
Department of Biostatistics, Harvard T. H. Chan School of Public Health, Harvard University, Boston, MA, United States.
JMIR Mhealth Uhealth. 2021 Nov 16;9(11):e28857. doi: 10.2196/28857.
Smartphone location data can be used for observational health studies (to determine participant exposure or behavior) or to deliver a location-based health intervention. However, missing location data are more common with smartphones than with research-grade location trackers. Missing location data can affect study validity and intervention safety.
The objective of this study was to investigate the distribution of missing location data and its predictors to inform design, analysis, and interpretation of future smartphone (observational and interventional) studies.
We analyzed hourly smartphone location data collected from 9665 research participants over 488,400 participant-days in a national smartphone study investigating the association between weather conditions and chronic pain in the United Kingdom. We used a generalized linear mixed-effects model with a logit link (mixed-effects logistic regression) to identify whether a successfully recorded geolocation was associated with the time of day, participants' time in study, operating system, time since previous survey completion, participant age, sex, and weather sensitivity.
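As a rough sketch of the kind of model described above, a mixed-effects logistic regression for hourly location capture could be written as follows. The abstract does not state the random-effects structure, so the participant-level random intercept $b_i$ and the symbols $Y_{ij}$, $\mathbf{x}_{ij}$, and $\sigma_b^2$ are assumptions introduced for illustration only.

```latex
% Y_{ij} = 1 if a location was recorded for participant i in hour j, 0 otherwise.
% x_{ij} collects the predictors (time of day, time in study, operating system, etc.).
\begin{align}
  \operatorname{logit} \Pr(Y_{ij} = 1 \mid b_i)
    &= \beta_0 + \mathbf{x}_{ij}^{\top} \boldsymbol{\beta} + b_i,
    & b_i &\sim \mathcal{N}(0, \sigma_b^2), \\
  \mathrm{OR}_k &= \exp(\beta_k).
\end{align}
```

Under this formulation, each reported odds ratio is the exponentiated coefficient of one predictor, conditional on the participant-level effect.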
The most common patterns were a median of 2 out of a maximum of 24 locations per day (1760/9665, 18.2% of participants), no location data (1664/9665, 17.2%), or complete location data (1575/9665, 16.3%). The median number of locations per day differed by operating system: participants with an Android phone most often had complete data (a median of 24/24 locations), whereas iPhone users most often had a median of 2 out of 24 locations. The odds of a successfully recorded location for Android phones were 22.91 times higher than those for iPhones (95% CI 19.53-26.87). The odds of a successfully recorded location were lower during weekends (odds ratio [OR] 0.94, 95% CI 0.94-0.95) and nights (OR 0.37, 95% CI 0.37-0.38), if time in study was longer (OR 0.99 per additional day in study, 95% CI 0.99-1.00), and if a participant had not used the app recently (OR 0.96 per additional day since last survey entry, 95% CI 0.96-0.96). Participant age and sex did not predict missing location data.
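The relationship between a logistic regression coefficient and a reported odds ratio with its confidence interval can be illustrated with a short sketch. It assumes Wald-type intervals that are symmetric on the log-odds scale, which the abstract does not confirm; the function names `odds_ratio_ci` and `se_from_ci` are hypothetical helpers, and the only inputs are the Android-versus-iPhone figures reported above.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a log-odds coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def se_from_ci(lower, upper, z=1.96):
    """Approximate the log-scale standard error implied by a Wald-type CI (assumption)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Android vs iPhone estimate as reported in the abstract: OR 22.91 (95% CI 19.53-26.87).
beta = math.log(22.91)           # coefficient on the log-odds scale
se = se_from_ci(19.53, 26.87)    # implied standard error, under the Wald assumption
or_, lo, hi = odds_ratio_ci(beta, se)
print(f"OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f}), implied log-scale SE {se:.3f}")
```

If the reconstructed interval matches the reported one closely, that is consistent with (but does not prove) a Wald-type interval on the log scale.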
The predictors of missing location data reported in our study could inform app settings and user instructions for future smartphone (observational and interventional) studies. These predictors have implications for analysis methods to deal with missing location data, such as imputation of missing values or case-only analysis. Health studies using smartphones for data collection should assess the context-specific consequences of high levels of missing data, especially among iPhone users, during the night, and for disengaged participants.