Big Data Institute, University of Oxford, Oxford, UK.
Nuffield Department of Population Health, University of Oxford, Oxford, UK.
Sci Data. 2024 Oct 16;11(1):1135. doi: 10.1038/s41597-024-03960-3.
Existing activity tracker datasets for human activity recognition are typically obtained by having participants perform predefined activities in an enclosed environment under supervision. This results in small datasets with a limited number of activities and little heterogeneity, lacking the mixed and nuanced movements normally found in free-living scenarios. As such, models trained on laboratory-style datasets may not generalise out of sample. To address this problem, we introduce a new dataset involving wrist-worn accelerometers, wearable cameras, and sleep diaries, enabling data collection for over 24 hours in a free-living setting. The result is CAPTURE-24, a large activity tracker dataset collected in the wild from 151 participants, amounting to 3883 hours of accelerometer data, of which 2562 hours are annotated. CAPTURE-24 is two to three orders of magnitude larger than existing publicly available datasets, a scale that is critical to developing accurate human activity recognition models.