Saez Yago, Baldominos Alejandro, Isasi Pedro
Department of Computer Science, Universidad Carlos III de Madrid, 28911 Leganés, Spain.
Sensors (Basel). 2016 Dec 30;17(1):66. doi: 10.3390/s17010066.
Physical activity is widely recognized as a key element of a healthy life. Its benefits, as described in the medical literature, include weight loss and reduced risk factors for chronic diseases. Recent advances in wearable devices, such as smartwatches and physical activity wristbands, have made motion-tracking sensors pervasive, which has led to rapid growth in the amount of physical activity data available and to increasing interest in recognizing which specific activity a user is performing. Moreover, big data and machine learning now cross-fertilize each other in an approach called "deep learning", which uses massive artificial neural networks to detect complicated patterns in enormous amounts of input data and learn classification models. This work compares several state-of-the-art classification techniques for automatic cross-person activity recognition under scenarios that vary widely in how much information is available for analysis. Deep learning was incorporated using Google's TensorFlow framework. The data used in this study come from PAMAP2 (Physical Activity Monitoring in the Ageing Population), a publicly available dataset of physical activity data. To perform cross-person prediction, we used leave-one-subject-out (LOSO) cross-validation, in which each subject's data are held out in turn for testing while the classifier is trained on the remaining subjects. With large training sets, the best classifiers obtain very high average accuracies (e.g., 96% using extremely randomized trees). However, when the data volume is drastically reduced (where available data are only 0.001% of the continuous data), deep neural networks performed best, achieving 60% overall prediction accuracy. We also found that with only approximately 22.67% of the full dataset, we obtain results that are statistically the same as those obtained with the full dataset. This finding enables the design of more energy-efficient devices and facilitates cold starts and big data processing of physical activity records.
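The leave-one-subject-out procedure mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names (`leave_one_subject_out`, `loso_accuracy`, `fit_centroids`, `predict_centroids`), the toy one-dimensional features, and the nearest-centroid classifier are all assumptions introduced here for illustration. The study itself used classifiers such as randomized trees and TensorFlow-based deep networks on PAMAP2 sensor data.

```python
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject."""
    by_subject = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        by_subject[subject].append(index)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = [i for s in sorted(by_subject) if s != held_out
                     for i in by_subject[s]]
        yield held_out, train_idx, test_idx


def loso_accuracy(features, labels, subject_ids, fit, predict):
    """Average per-subject accuracy; `fit` and `predict` are caller-supplied."""
    scores = []
    for _, train_idx, test_idx in leave_one_subject_out(subject_ids):
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        predicted = predict(model, [features[i] for i in test_idx])
        correct = sum(p == labels[i] for p, i in zip(predicted, test_idx))
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Hypothetical stand-in classifier: nearest class centroid on one feature.
def fit_centroids(xs, ys):
    sums = defaultdict(lambda: [0.0, 0])
    for x, y in zip(xs, ys):
        sums[y][0] += x
        sums[y][1] += 1
    return {y: total / count for y, (total, count) in sums.items()}


def predict_centroids(model, xs):
    return [min(model, key=lambda y: abs(model[y] - x)) for x in xs]


# Toy data (invented): three subjects, two activities each.
subject_ids = [1, 1, 2, 2, 3, 3]
features = [1.0, 5.0, 1.2, 5.1, 0.9, 4.8]
labels = ["walk", "run", "walk", "run", "walk", "run"]

accuracy = loso_accuracy(features, labels, subject_ids,
                         fit_centroids, predict_centroids)
print(f"LOSO accuracy: {accuracy:.2f}")
```

Because every test fold contains only a subject the model has never seen, the averaged score estimates cross-person performance rather than performance on people already represented in the training data.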