Automatic Annotation of Unlabeled Data from Smartphone-Based Motion and Location Sensors.

Affiliation

School of Computer Sciences, Universiti Sains Malaysia, 11800 USM Penang, Malaysia.

Publication information

Sensors (Basel). 2018 Jul 3;18(7):2134. doi: 10.3390/s18072134.

Abstract

Automatic data annotation eliminates most of the challenges posed by manual annotation of sensor data. It significantly improves users’ experience during sensing activities because it reduces their active involvement in the labeling process. An unsupervised learning technique such as clustering can be used to annotate sensor data automatically. However, a lingering issue with clustering is the validation of the generated clusters. In this paper, we adopted the k-means clustering algorithm to annotate unlabeled sensor data for the purpose of detecting sensitive location information of mobile crowd sensing users. Furthermore, we proposed a cluster validation index for the k-means algorithm, which is based on Multiple Pair-Frequency. Thereafter, we trained three classifiers (Support Vector Machine, k-Nearest Neighbor, and Naïve Bayes) using cluster labels generated by the k-means clustering algorithm. The accuracy, precision, and recall of these classifiers were evaluated on the classification of “non-sensitive” and “sensitive” data from motion and location sensors. The Support Vector Machine and k-Nearest Neighbor classifiers recorded very high accuracy scores, while the Naïve Bayes classifier recorded a fairly high accuracy score. With the hybridized machine learning technique (unsupervised and supervised) presented in this paper, unlabeled sensor data was automatically annotated and then classified.
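The two-stage idea in the abstract (cluster first, then train a classifier on the cluster labels) can be illustrated with a minimal, dependency-free sketch. Everything here is an assumption made for illustration and not the authors' implementation: the two-dimensional synthetic features stand in for real motion and location readings, the helper names `kmeans`, `knn_predict`, and `make_blob` are hypothetical, only a k-Nearest Neighbor classifier is shown (Support Vector Machine and Naïve Bayes are omitted), and the Multiple Pair-Frequency validation index is not reproduced. Note also that k-means returns arbitrary cluster ids; deciding which id means “sensitive” is a separate step not covered here.

```python
import math
import random
from collections import Counter


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def make_blob(center, n, rng, spread=0.3):
    """Synthetic 2-D readings scattered around `center` (illustrative only)."""
    return [(rng.gauss(center[0], spread), rng.gauss(center[1], spread))
            for _ in range(n)]


rng = random.Random(42)
# Hypothetical unlabeled data: two well-separated groups of sensor features.
unlabeled = make_blob((0.0, 0.0), 40, rng) + make_blob((5.0, 5.0), 40, rng)
rng.shuffle(unlabeled)

# Stage 1 (unsupervised): cluster ids serve as automatic annotations.
centroids, cluster_labels = kmeans(unlabeled, k=2)

# Stage 2 (supervised): train on the annotations, evaluate on held-out points.
train_x, train_y = unlabeled[:60], cluster_labels[:60]
test_x, test_y = unlabeled[60:], cluster_labels[60:]
predictions = [knn_predict(train_x, train_y, q) for q in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"agreement with cluster labels: {accuracy:.2f}")
```

The score printed here measures agreement with the cluster labels, not with ground truth, so it is only as trustworthy as the clustering itself. That dependence is why the abstract pairs the approach with a cluster validation index.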


