Missing Data Statistics Provide Causal Insights into Data Loss in Diabetes Health Monitoring by Wearable Sensors.

Author information

Department of Biomedical Signals and Systems, University of Twente, 7522 NB Enschede, The Netherlands.

Publication information

Sensors (Basel). 2024 Feb 27;24(5):1526. doi: 10.3390/s24051526.

Abstract

BACKGROUND

Data loss in wearable sensors is an inevitable problem that leads to misrepresentation during diabetes health monitoring. We systematically investigated missing wearable sensor data to gain causal insight into the mechanisms leading to missing data.

METHODS

Two-week-long data from a continuous glucose monitor and a Fitbit activity tracker recording heart rate (HR) and step count in free-living patients with type 2 diabetes mellitus were used. The gap size distribution was fitted with a Planck distribution to test for missing not at random (MNAR) and a difference between distributions was tested with a Chi-squared test. Significant missing data dispersion over time was tested with the Kruskal-Wallis test and Dunn post hoc analysis.
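The gap-size analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not give the parametrization of the Planck distribution, so the sketch assumes the common discrete form P(k) = (1 - exp(-lam)) * exp(-lam * k) for k = 0, 1, 2, ..., shifted by one so that the smallest possible gap (one missing sample) maps to k = 0. The function names, the timestamp input format, and the tail-pooling choice are all hypothetical.

```python
import math
from collections import Counter


def gap_sizes(timestamps, interval):
    """Return the number of missing samples in each gap.

    timestamps: sorted sample times in seconds (assumed input format).
    interval: expected sampling interval in seconds.
    """
    gaps = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        missing = round((curr - prev) / interval) - 1
        if missing > 0:
            gaps.append(missing)
    return gaps


def planck_pmf(k, lam):
    """Assumed Planck form: P(k) = (1 - exp(-lam)) * exp(-lam * k), k >= 0."""
    return (1.0 - math.exp(-lam)) * math.exp(-lam * k)


def fit_planck(gaps):
    """Maximum-likelihood rate for gap sizes shifted by one (size 1 -> k = 0)."""
    mean_shifted = sum(g - 1 for g in gaps) / len(gaps)
    if mean_shifted == 0:
        raise ValueError("all gaps have size 1; the rate is unbounded")
    return math.log(1.0 + 1.0 / mean_shifted)


def chi_squared_statistic(gaps, lam, max_size):
    """Pearson chi-squared statistic of observed vs. fitted gap-size counts.

    Sizes above max_size are pooled into a single tail bin.
    """
    counts = Counter(min(g, max_size + 1) for g in gaps)
    n = len(gaps)
    stat = 0.0
    tail_prob = 1.0
    for size in range(1, max_size + 1):
        p = planck_pmf(size - 1, lam)
        tail_prob -= p
        expected = n * p
        stat += (counts.get(size, 0) - expected) ** 2 / expected
    expected_tail = n * tail_prob
    stat += (counts.get(max_size + 1, 0) - expected_tail) ** 2 / expected_tail
    return stat
```

The resulting statistic would then be compared against a chi-squared distribution with the appropriate degrees of freedom to obtain a p-value; that step is omitted here.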

RESULTS

Data from 77 subjects resulted in 73 cleaned glucose, 70 HR and 68 step count recordings. The glucose gap sizes followed a Planck distribution. HR and step count gap frequency differed significantly (p < 0.001), and the missing data were therefore MNAR. In glucose, more missing data were found in the night (23:00-01:00), and in step count, more at measurement days 6 and 7 (p < 0.001). In both cases, missing data were caused by insufficient frequency of data synchronization.
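The time-of-day and measurement-day comparisons reported above rely on the Kruskal-Wallis test. As a rough illustration of what that test computes, the sketch below derives the H statistic from groups of values, for example per-subject missing-data fractions grouped by time bin. The grouping and the function name are assumptions, not details taken from the study, and the Dunn post hoc step is not shown.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with average ranks and tie correction.

    groups: list of lists of numbers, one list per group (assumed layout).
    Raises ZeroDivisionError if every pooled value is identical.
    """
    # Pool all values, remembering which group each one came from.
    pooled = sorted(
        (value, gi) for gi, group in enumerate(groups) for value in group
    )
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)
    ) - 3.0 * (n + 1)
    correction = 1.0 - tie_term / (n ** 3 - n)
    return h / correction
```

A large H relative to a chi-squared distribution with (number of groups - 1) degrees of freedom indicates that at least one group differs from the others, which is the situation in which a post hoc analysis identifies the specific groups.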

CONCLUSIONS

Our novel approach of investigating missing data statistics revealed the mechanisms for missing data in Fitbit and CGM data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/67a5/10935383/65281ffa0d7f/sensors-24-01526-g001.jpg
