Mao Yingchi, Zhong Haishi, Qi Hai, Ping Ping, Li Xiaofang
College of Computer and Information, Hohai University, Nanjing 210098, China.
School of Computer Information & Engineering, Changzhou Institute of Technology, Changzhou 213032, China.
Sensors (Basel). 2017 Sep 2;17(9):2013. doi: 10.3390/s17092013.
Clustering analysis is one of the most important issues in trajectory data mining. Trajectory clustering is widely applicable to hotspot detection, mobility pattern analysis, urban transportation control, and hurricane prediction. To achieve good clustering performance, existing trajectory clustering approaches require one or more input parameters whose optimal values must be calibrated, which imposes a heavy workload and high computational cost. To realize adaptive parameter calibration and reduce the workload of trajectory clustering, this paper proposes an adaptive trajectory clustering approach based on the grid and density (ATCGD). ATCGD consists of three phases: partition, mapping, and clustering. In the partition phase, ATCGD applies the average angular difference-based MDL (AD-MDL) partition method, which maintains partition accuracy while decreasing the number of segments produced by the partition. In the mapping phase, the partitioned segments are mapped into their corresponding grid cells, and the mapping relationship between the segments and the cells is stored. In the clustering phase, a DBSCAN-based method clusters the segments in the cells using the parameter values calibrated during the mapping phase. Extensive experiments indicate that although the adaptively calibrated parameters are not optimal, in most cases the difference between the adaptive calibration and the optimal result is less than 5%, while the clustering run time is reduced by about 95% compared with the TRACLUS algorithm.
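The mapping phase described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how a segment is assigned to a cell or how parameters are derived from the grid. The sketch assumes, purely for illustration, that each segment goes to the cell containing its midpoint, and that the mean number of segments per non-empty cell could serve as a density hint for a later DBSCAN-style step. The names `map_segments_to_cells`, `mean_cell_density`, and `cell_size` are hypothetical.

```python
from collections import defaultdict
from math import floor


def map_segments_to_cells(segments, cell_size):
    """Assign each segment to the grid cell that contains its midpoint.

    Assumption (not from the paper): midpoint-based assignment on a
    uniform square grid with side length `cell_size`.
    Returns a dict mapping (col, row) -> list of segment indices.
    """
    cells = defaultdict(list)
    for idx, ((x1, y1), (x2, y2)) in enumerate(segments):
        mx, my = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        cell = (floor(mx / cell_size), floor(my / cell_size))
        cells[cell].append(idx)
    return dict(cells)


def mean_cell_density(cells):
    """Mean number of segments per non-empty cell.

    Assumption: used here only as a hypothetical density hint; the
    paper's actual calibration rule is not given in the abstract.
    """
    if not cells:
        return 0.0
    return sum(len(members) for members in cells.values()) / len(cells)


# Four line segments, each given as ((x1, y1), (x2, y2)).
segments = [
    ((0.0, 0.0), (1.0, 1.0)),
    ((0.2, 0.1), (0.9, 0.8)),
    ((5.0, 5.0), (6.0, 6.5)),
    ((-1.0, -1.0), (-2.0, -3.0)),
]
cells = map_segments_to_cells(segments, cell_size=2.0)
density_hint = mean_cell_density(cells)
```

Storing the cell-to-segment mapping means a later clustering step only has to compare segments within the same or neighbouring cells, instead of comparing every pair of segments.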