Factors Affecting the Quality of Person-Generated Wearable Device Data and Associated Challenges: Rapid Systematic Review.

Author information

Department of Biomedical Informatics, Columbia University, New York, NY, United States.

Data Science Institute, Columbia University, New York, NY, United States.

Publication information

JMIR Mhealth Uhealth. 2021 Mar 19;9(3):e20738. doi: 10.2196/20738.

Abstract

BACKGROUND

There is increasing interest in reusing person-generated wearable device data for research purposes, which raises concerns about data quality. However, the literature on data quality challenges, specifically those for person-generated wearable device data, is sparse.

OBJECTIVE

This study aims to systematically review the literature on factors affecting the quality of person-generated wearable device data and their associated intrinsic data quality challenges for research.

METHODS

The literature was searched in the PubMed, Association for Computing Machinery, Institute of Electrical and Electronics Engineers, and Google Scholar databases by using search terms related to wearable devices and data quality. By using PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, studies were reviewed to identify factors affecting the quality of wearable device data. Studies were eligible if they included content on the data quality of wearable devices, such as fitness trackers and sleep monitors. Both research-grade and consumer-grade wearable devices were included in the review. Relevant content was annotated and iteratively categorized into semantically similar factors until a consensus was reached. If any data quality challenges were mentioned in the study, those contents were extracted and categorized as well.

RESULTS

A total of 19 papers were included in this review. We identified three high-level factors that affect data quality: device- and technical-related factors, user-related factors, and data governance-related factors. Device- and technical-related factors include problems with hardware, software, and the connectivity of the device; user-related factors include device nonwear and user error; and data governance-related factors include a lack of standardization. The identified factors can potentially lead to intrinsic data quality challenges, such as incomplete, incorrect, and heterogeneous data. Although missing and incorrect data are widely known data quality challenges for wearable devices, the heterogeneity of data is another aspect of data quality that should be considered for wearable devices. Heterogeneity in wearable device data exists at three levels: heterogeneity in data generated by a single person using a single device (within-person heterogeneity); heterogeneity in data generated by multiple people who use the same brand, model, and version of a device (between-person heterogeneity); and heterogeneity in data generated from multiple people using different devices (between-person heterogeneity), which would apply especially to data collected under a bring-your-own-device policy.
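As an illustration only, the three intrinsic challenges named above (incomplete, incorrect, and heterogeneous data) could be surfaced in a dataset with a few simple counts. The sketch below is not taken from the review: the `Reading` record, its field names, the sample values, and the heart rate plausibility range of 30 to 220 beats per minute are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed plausibility range for heart rate (beats per minute); illustrative only.
HR_MIN, HR_MAX = 30, 220


@dataclass
class Reading:
    """Hypothetical per-minute heart rate sample from a wearable device."""
    person_id: str
    device_model: str
    minute: int                 # minutes since the start of recording
    heart_rate: Optional[int]   # None marks a missing sample (e.g., device nonwear)


def summarize_quality(readings):
    """Return simple indicators for incomplete, incorrect, and heterogeneous data."""
    total = len(readings)
    # Incomplete data: samples with no recorded value.
    missing = sum(1 for r in readings if r.heart_rate is None)
    # Incorrect data: recorded values outside the assumed plausible range.
    implausible = sum(
        1 for r in readings
        if r.heart_rate is not None and not (HR_MIN <= r.heart_rate <= HR_MAX)
    )
    # Heterogeneous data: more than one device model contributing to the dataset.
    devices = {r.device_model for r in readings}
    return {
        "missing_fraction": missing / total if total else 0.0,
        "implausible_count": implausible,
        "distinct_device_models": len(devices),
    }


# Made-up sample data for demonstration.
readings = [
    Reading("p1", "TrackerA", 0, 72),
    Reading("p1", "TrackerA", 1, None),   # missing sample
    Reading("p1", "TrackerA", 2, 400),    # implausible value
    Reading("p2", "TrackerB", 0, 65),     # different device model
]
summary = summarize_quality(readings)
print(summary)
```

Counting distinct device models only hints at heterogeneity across devices; within-person heterogeneity and heterogeneity between people using the same device would need additional checks that this sketch does not attempt.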

CONCLUSIONS

Our study identifies potential intrinsic data quality challenges that could occur when analyzing wearable device data for research and three major contributing factors for these challenges. As poor data quality can compromise the reliability and accuracy of research results, further investigation is needed on how to address the data quality challenges of wearable devices.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa4d/8294465/ca833180dcf4/mhealth_v9i3e20738_fig1.jpg
