Guinness Robert E
Finnish Geospatial Research Institute, Geodeetinrinne 2, FI-02430 Masala, Finland.
Sensors (Basel). 2015 Apr 28;15(5):9962-85. doi: 10.3390/s150509962.
This paper presents the results of research on the use of smartphone sensors (namely, GPS and accelerometers), geospatial information (points of interest, such as bus stops and train stations) and machine learning (ML) to sense mobility contexts. Our goal is to develop techniques to continuously and automatically detect a smartphone user's mobility activities, including walking, running, driving and using a bus or train, in real time or near-real time (<5 s). We investigated a wide range of supervised learning techniques for classification, including decision trees (DT), support vector machines (SVM), naive Bayes classifiers (NB), Bayesian networks (BN), logistic regression (LR), artificial neural networks (ANN) and several instance-based classifiers (KStar, LWL and IBk). Under ten-fold cross-validation, the best performers in terms of correct classification rate (i.e., recall) were DT (96.5%), BN (90.9%), LWL (95.5%) and KStar (95.6%). In particular, the DT algorithm RandomForest exhibited the best overall performance. Applying feature selection to a subset of the algorithms improved performance slightly. Furthermore, tuning the parameters of RandomForest raised its performance to above 97.5%. Lastly, we measured the computational complexity of the classifiers, in terms of the central processing unit (CPU) time needed for classification, to provide a rough comparison of the algorithms' battery usage requirements. On this measure, the classifiers rank from lowest to highest complexity (i.e., computational cost) as follows: SVM, ANN, LR, BN, DT, NB, IBk, LWL and KStar. The instance-based classifiers take considerably more computational time than the non-instance-based classifiers, whereas the slowest non-instance-based classifier (NB) required about five times as much CPU time as the fastest classifier (SVM).
The above results suggest that DT algorithms are excellent candidates for detecting mobility contexts in smartphones, both in terms of performance and computational complexity.
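The evaluation procedure described in the abstract (ten-fold cross-validation, correct classification rate, and CPU time spent on classification) can be illustrated with a minimal Python sketch. This is not the study's code or data: the synthetic two-feature samples, the class centers, and the function names `make_synthetic_data`, `predict_1nn` and `cross_validate` are all assumptions made for illustration, and a plain 1-nearest-neighbour rule stands in for an instance-based classifier.

```python
import random
import time


def make_synthetic_data(n_per_class=60, seed=0):
    """Build hypothetical (features, label) samples; NOT data from the study.

    The two features loosely mimic speed and acceleration variance.
    """
    rng = random.Random(seed)
    centers = {
        "walking": (1.4, 2.0),
        "running": (3.0, 6.0),
        "driving": (15.0, 0.5),
    }
    data = []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            features = (rng.gauss(cx, 0.4), rng.gauss(cy, 0.4))
            data.append((features, label))
    rng.shuffle(data)
    return data


def predict_1nn(train, x):
    """Return the label of the training sample closest to x (squared Euclidean)."""
    nearest = min(
        train,
        key=lambda s: (s[0][0] - x[0]) ** 2 + (s[0][1] - x[1]) ** 2,
    )
    return nearest[1]


def cross_validate(data, k=10):
    """Run k-fold cross-validation.

    Returns (correct classification rate, CPU seconds spent classifying).
    """
    # Every k-th sample goes to the same fold, so each sample is tested once.
    folds = [data[i::k] for i in range(k)]
    correct = 0
    cpu_seconds = 0.0
    for i, test_fold in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        start = time.process_time()  # CPU time, not wall-clock time
        predictions = [predict_1nn(train, x) for x, _ in test_fold]
        cpu_seconds += time.process_time() - start
        correct += sum(p == y for p, (_, y) in zip(predictions, test_fold))
    return correct / len(data), cpu_seconds


if __name__ == "__main__":
    rate, cpu = cross_validate(make_synthetic_data(), k=10)
    print(f"correct classification rate: {rate:.3f}")
    print(f"CPU time for classification: {cpu:.4f} s")
```

An instance-based rule like this one compares each query against every stored training sample at prediction time, which is consistent with the abstract's observation that instance-based classifiers need considerably more CPU time for classification than the others.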