Sun She, Ma Shuai, Song Jing-He, Yue Wen-Hai, Lin Xue-Lian, Ma Tiejun
State Key Laboratory of Software Development Environment, School of Computer Science and Engineering, Beihang University, Beijing, 100191 China.
Department of Decision Analytics and Risk, Southampton Business School, University of Southampton, Southampton, SO17 1BJ UK.
J Comput Sci Technol. 2022;37(5):1026-1048. doi: 10.1007/s11390-022-2409-x. Epub 2022 Sep 30.
With advances in location-detection technologies and the increasing popularity of mobile phones and other location-aware devices, trajectory data is growing continuously. While large-scale trajectories provide opportunities for various applications, the locations in trajectories pose a threat to individual privacy. Recently, there has been an interesting debate in Science magazine on the reidentifiability of individuals. The main finding of Sánchez et al. is exactly opposite to that of De Montjoye et al., which raises the first question: "what is the true situation of the privacy preservation for trajectories in terms of reidentification?" Furthermore, anonymization is known to typically reduce data utility, so anonymization mechanisms need to consider the trade-off between privacy and utility. This raises the second question: "what is the true situation of the utility of anonymized trajectories?" To answer these two questions, we conduct a systematic experimental study using three real-life trajectory datasets, five existing anonymization mechanisms (i.e., identifier anonymization, grid-based anonymization, dummy trajectories, k-anonymity and ε-differential privacy), and two practical applications (i.e., travel time estimation and window range queries). Our findings reveal the true situation of privacy preservation for trajectories in terms of reidentification and the true situation of the utility of anonymized trajectories, and essentially close the debate between De Montjoye et al. and Sánchez et al. To the best of our knowledge, this study is among the first systematic evaluations and analyses of anonymized trajectories with respect to individual privacy in terms of unicity and to utility in terms of practical applications.
The online version contains supplementary material available at 10.1007/s11390-022-2409-x.