Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA, USA.
Department of Family and Preventive Medicine, University of California San Diego, La Jolla, CA, USA.
Front Public Health. 2014 Apr 22;2:36. doi: 10.3389/fpubh.2014.00036. eCollection 2014.
Active travel is an important area in physical activity research, but objective measurement of active travel is still difficult. Automated methods to measure travel behaviors will improve research in this area. In this paper, we present a supervised machine learning method for transportation mode prediction from global positioning system (GPS) and accelerometer data.
We collected a dataset of about 150 h of GPS and accelerometer data from two research assistants who followed a protocol of prescribed trips consisting of five activities: bicycling, riding in a vehicle, walking, sitting, and standing. We extracted 49 features from each 1-min window of these data. We compared the performance of several machine learning algorithms and chose a random forest algorithm to classify the transportation mode. We used a moving average output filter to smooth the output predictions over time.
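The windowing step can be illustrated with a minimal Python sketch. It assumes each sample is a tuple of timestamp in seconds, acceleration magnitude, and GPS speed. The function name `window_features` and the four summary features are hypothetical stand-ins, since the abstract does not list the 49 features that were actually used.

```python
from statistics import mean, pstdev

WINDOW_SECONDS = 60  # 1-min windows, as stated in the abstract


def window_features(samples, window_seconds=WINDOW_SECONDS):
    """Group samples into fixed-length windows and summarize each window.

    `samples` is an iterable of (t_seconds, accel_magnitude, gps_speed)
    tuples. This layout and the features below are illustrative
    assumptions, not the paper's feature set.
    """
    windows = {}
    for t, accel, speed in samples:
        # Integer window index: 0 for [0, 60), 1 for [60, 120), ...
        windows.setdefault(int(t // window_seconds), []).append((accel, speed))

    features = []
    for index in sorted(windows):
        accel_values = [a for a, _ in windows[index]]
        speed_values = [s for _, s in windows[index]]
        features.append({
            "window": index,
            "accel_mean": mean(accel_values),
            "accel_std": pstdev(accel_values),  # population std within the window
            "speed_mean": mean(speed_values),
            "speed_max": max(speed_values),
        })
    return features
```

Each resulting dictionary would become one row of the feature matrix passed to a classifier such as a random forest.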
The random forest algorithm achieved 89.8% cross-validated accuracy on this dataset. Adding the moving average filter to smooth output predictions increased the cross-validated accuracy to 91.9%, a gain of 2.1 percentage points.
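One way a moving average output filter might work is sketched below in Python. The abstract does not give the filter width or say whether the filter operates on class probabilities or on hard labels. This sketch therefore assumes a centered window of three 1-min predictions, averaged over per-class probabilities; the function name `smooth_predictions` is hypothetical.

```python
def smooth_predictions(probabilities, labels, width=3):
    """Smooth per-window class probabilities with a centered moving average.

    `probabilities` is a list with one entry per 1-min window, each entry a
    list of class probabilities ordered like `labels`. The width of 3 and
    the use of probabilities rather than hard labels are assumptions.
    """
    half = width // 2
    smoothed = []
    for i in range(len(probabilities)):
        # Clip the averaging window at the start and end of the sequence.
        lo = max(0, i - half)
        hi = min(len(probabilities), i + half + 1)
        neighbours = probabilities[lo:hi]
        averaged = [
            sum(p[k] for p in neighbours) / len(neighbours)
            for k in range(len(labels))
        ]
        # Pick the label with the highest averaged probability.
        smoothed.append(labels[averaged.index(max(averaged))])
    return smoothed
```

With this kind of filter, an isolated window predicted as bicycling in the middle of a walking trip is overruled by its neighbours, which is the sort of correction that could account for a gain in accuracy after smoothing.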
Machine learning methods are a viable approach for automating measurement of active travel, particularly for measuring travel activities that traditional accelerometer data processing methods misclassify, such as bicycling and vehicle travel.