Machine Learning and Data Analytics Lab, Computer Science Department, 91052 Erlangen, Germany.
Faculty of Computer and Information Science, University of Ljubljana, 1000 Ljubljana, Slovenia.
Sensors (Basel). 2019 Apr 16;19(8):1820. doi: 10.3390/s19081820.
Activity monitoring using wearables is becoming ubiquitous, although accurate cycle-level analysis, such as step counting and gait analysis, is limited by a lack of realistic and labeled datasets. The effort required to obtain and annotate such datasets is massive. We therefore propose a smart annotation pipeline which reduces the proportion of events needing manual adjustment to 14%. For scenarios dominated by walking, this annotation effort is as low as 8%. The pipeline consists of three smart annotation approaches, namely edge detection of the pressure data, local cyclicity estimation, and iteratively trained hierarchical hidden Markov models. Using this pipeline, we have collected and labeled a dataset with over 150,000 labeled cycles, each with 2 phases, from 80 subjects, which we have made publicly available. The dataset consists of 12 different task-driven activities, 10 of which are cyclic. These activities include not only straight and steady-state motions, but also transitions, different ranges of bouts, and changing directions. Each participant wore 5 synchronized inertial measurement units (IMUs) on the wrists, shoes, and in a pocket, as well as pressure insoles, and was recorded on video. We believe that this dataset and smart annotation pipeline are a good basis for creating a benchmark dataset for validation of other semi-supervised and unsupervised algorithms.
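As a rough illustration of the first annotation approach, edge detection of the pressure data, the sketch below marks threshold crossings in a single summed insole-pressure trace and pairs them into cycles with 2 phases each. It is a minimal sketch under stated assumptions, not the pipeline used for the dataset: the function names `detect_pressure_edges` and `cycles_from_edges`, the fixed threshold, the reading of a rising edge as foot contact, and the synthetic signal are all hypothetical.

```python
from typing import List, Optional, Tuple


def detect_pressure_edges(
    pressure: List[float], threshold: float
) -> Tuple[List[int], List[int]]:
    """Return (rising, falling) sample indices where the signal crosses threshold.

    Assumption: a rising edge is read as foot contact, a falling edge as foot-off.
    """
    rising: List[int] = []
    falling: List[int] = []
    for i in range(1, len(pressure)):
        was_loaded = pressure[i - 1] >= threshold
        is_loaded = pressure[i] >= threshold
        if is_loaded and not was_loaded:
            rising.append(i)
        elif was_loaded and not is_loaded:
            falling.append(i)
    return rising, falling


def cycles_from_edges(
    rising: List[int], falling: List[int]
) -> List[Tuple[int, int, int]]:
    """Pair consecutive rising edges into cycles with 2 phases each.

    Each cycle is (start, phase_change, end): the loaded phase runs from
    start to phase_change, the unloaded phase from phase_change to end.
    """
    cycles: List[Tuple[int, int, int]] = []
    for start, end in zip(rising, rising[1:]):
        # The first falling edge between two contacts splits the cycle.
        split: Optional[int] = next(
            (f for f in falling if start < f < end), None
        )
        if split is not None:
            cycles.append((start, split, end))
    return cycles


# Synthetic summed insole pressure in arbitrary units (not real data).
signal = [0, 0, 5, 9, 8, 1, 0, 0, 6, 9, 7, 0, 0, 7, 8, 0]
rising, falling = detect_pressure_edges(signal, threshold=4.0)
cycles = cycles_from_edges(rising, falling)
print(cycles)
```

In a semi-automatic workflow of the kind described above, candidate cycles like these would only be proposals, with the remaining misplaced events corrected manually.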