Bidirectional imputation of spatial GPS trajectories with missingness using sparse online Gaussian Process.

Author information

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.

Publication information

J Am Med Inform Assoc. 2021 Jul 30;28(8):1777-1784. doi: 10.1093/jamia/ocab069.

Abstract

OBJECTIVE

We propose a bidirectional GPS imputation method that can recover real-world mobility trajectories even when a substantial proportion of the data are missing. The time complexity of our online method is linear in the sample size, and it provides accurate estimates of daily or hourly summary statistics such as time spent at home and distance traveled.
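As an illustration of the two summary statistics named above, the following is a minimal sketch, not the authors' implementation. It assumes a trajectory stored as time-sorted `(t_seconds, lat, lon)` tuples, a known home coordinate, and a hypothetical 200 m home radius; the function names are invented for this example. Distances use the haversine great-circle formula, and each interval is credited to the location at its start.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_traveled_km(track):
    """Sum of great-circle distances between consecutive points."""
    return sum(
        haversine_km(a[1], a[2], b[1], b[2]) for a, b in zip(track, track[1:])
    )


def time_at_home_s(track, home, radius_km=0.2):
    """Seconds spent within radius_km of home (assumed radius).

    Each interval between consecutive points is attributed to the
    location at the start of the interval.
    """
    total = 0.0
    for a, b in zip(track, track[1:]):
        if haversine_km(a[1], a[2], home[0], home[1]) <= radius_km:
            total += b[0] - a[0]
    return total


# Hypothetical example: 20 minutes at home, then a trip about 11 km north.
track = [(0, 42.0, -71.0), (600, 42.0, -71.0), (1200, 42.1, -71.0), (1800, 42.1, -71.0)]
home = (42.0, -71.0)
print(distance_traveled_km(track), time_at_home_s(track, home))
```

Both functions make a single pass over the trajectory, so their cost grows linearly with the number of points.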

MATERIALS AND METHODS

To preserve a smartphone's battery, GPS may be sampled for only a small portion of the time, frequently <10%, which leads to a substantial missing data problem. We developed an algorithm that simulates an individual's trajectory from observed GPS location traces using a sparse online Gaussian Process, which addresses the high computational complexity of the existing method. The method also retains the spherical geometry of the problem and imputes the missing trajectory in a bidirectional fashion, with multiple condition checks to improve accuracy.
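The abstract does not spell out how the forward and backward passes are combined, so the following is only a generic illustration of the bidirectional idea, not the paper's algorithm. It assumes that two candidate fills for the same gap already exist, one simulated forward from the last observation before the gap and one simulated backward from the first observation after it, and it cross-fades them linearly so that each point leans toward the nearer observed endpoint. The function name `crossfade` is hypothetical, and averaging latitude and longitude directly is a flat-earth simplification that is only reasonable over short distances away from the poles and the antimeridian.

```python
def crossfade(forward, backward):
    """Blend two equal-length lists of (lat, lon) estimates for one gap.

    forward[i] and backward[i] are estimates of the same missing time
    point. The weight on the backward estimate rises linearly from the
    start of the gap to its end (an assumed weighting scheme).
    """
    if len(forward) != len(backward):
        raise ValueError("forward and backward fills must have equal length")
    n = len(forward)
    blended = []
    for i, ((flat, flon), (blat, blon)) in enumerate(zip(forward, backward)):
        w = (i + 1) / (n + 1)  # 0 < w < 1, larger near the end of the gap
        blended.append(((1 - w) * flat + w * blat, (1 - w) * flon + w * blon))
    return blended


# Hypothetical three-point gap with constant forward and backward fills.
forward = [(0.0, 0.0)] * 3
backward = [(4.0, 8.0)] * 3
print(crossfade(forward, backward))
```

The appeal of any bidirectional scheme is that information from both sides of a gap constrains the fill, whereas a purely forward simulation can drift away from the next observed location.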

RESULTS

We demonstrated that (1) the imputed trajectories mimic real-world trajectories, (2) the confidence intervals of summary statistics cover the ground truth in most cases, and (3) our algorithm is much faster than existing methods when more than 3 months of observations are available. We also provide guidelines on optimal sampling strategies.

CONCLUSIONS

Our approach outperformed existing methods and was significantly faster. It can be used in settings in which data need to be analyzed and acted on continuously, for example, to detect behavioral anomalies that might affect treatment adherence, or to learn about colocations of individuals during an epidemic.
