Zhang Haitao, Wu Chenxue, Chen Zewei, Liu Zhao, Zhu Yunhong
School of Geographic and Biological Information, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu, China.
School of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu, China.
PLoS One. 2017 Aug 2;12(8):e0182232. doi: 10.1371/journal.pone.0182232. eCollection 2017.
Analyzing large-scale spatial-temporal k-anonymity datasets recorded in location-based service (LBS) application servers can benefit some LBS applications. However, such analyses can allow adversaries to mount inference attacks that cannot be handled by spatial-temporal k-anonymity methods or by other methods for protecting sensitive knowledge. In response to this challenge, we first defined a destination location prediction attack model based on privacy-sensitive sequence rules mined from large-scale anonymity datasets. We then proposed a novel on-line spatial-temporal k-anonymity method that can resist such inference attacks. Our anti-attack technique generates new anonymity datasets with awareness of privacy-sensitive sequence rules. The new datasets extend the original sequence database of anonymity datasets so that the privacy-sensitive rules are hidden progressively. The process includes two phases: off-line analysis and on-line application. In the off-line phase, sequence rules are mined from the original sequence database of anonymity datasets, and privacy-sensitive sequence rules are identified by correlating privacy-sensitive spatial regions with the spatial grid cells that appear in the sequence rules. In the on-line phase, new anonymity datasets are generated upon LBS requests by adopting specific generalization and avoidance principles, which progressively hide the privacy-sensitive sequence rules from the extended sequence database of anonymity datasets. We conducted extensive experiments to test the performance of the proposed method and to explore the influence of the parameter K. The results demonstrated that our proposed approach is faster and more effective at hiding privacy-sensitive sequence rules, as measured by the ratio of sensitive rules hidden, and thereby at eliminating inference attacks. Our method also had fewer side effects than the traditional spatial-temporal k-anonymity method in terms of the ratio of newly generated sensitive rules, and essentially the same side effects in terms of the variation ratio of non-sensitive rules. Furthermore, we characterized how performance varies with the parameter K, which can help achieve the goal of hiding the maximum number of original sensitive rules while generating a minimum of new sensitive rules and affecting a minimum number of non-sensitive rules.
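The three evaluation quantities named above (ratio of sensitive rules hidden, ratio of new sensitive rules generated, and variation ratio of non-sensitive rules) can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions, so everything here is an assumption made for illustration: a rule is modeled as a pair (antecedent grid cells, predicted destination cell), a rule is treated as privacy-sensitive when its destination cell lies in a privacy-sensitive region, and the denominators of the ratios are one plausible choice. The names `Rule`, `is_sensitive`, `split_rules`, and `side_effect_ratios` are hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical rule representation: (antecedent grid cells, destination cell).
Rule = Tuple[Tuple[str, ...], str]


def is_sensitive(rule: Rule, sensitive_cells: Set[str]) -> bool:
    """Assumption: a rule is sensitive if it predicts a sensitive destination."""
    _antecedent, destination = rule
    return destination in sensitive_cells


def split_rules(rules: Set[Rule], sensitive_cells: Set[str]):
    """Partition a rule set into (sensitive, non-sensitive) subsets."""
    sensitive = {r for r in rules if is_sensitive(r, sensitive_cells)}
    return sensitive, set(rules) - sensitive


def side_effect_ratios(
    before: Set[Rule], after: Set[Rule], sensitive_cells: Set[str]
) -> Dict[str, float]:
    """Compare rules mined before and after new anonymity datasets are added.

    The denominators are assumptions; the source may normalize differently.
    """
    s_before, n_before = split_rules(before, sensitive_cells)
    s_after, n_after = split_rules(after, sensitive_cells)
    return {
        # Share of original sensitive rules that can no longer be mined.
        "hidden_sensitive": (
            len(s_before - s_after) / len(s_before) if s_before else 0.0
        ),
        # Sensitive rules that appear only afterwards, relative to the originals.
        "new_sensitive": (
            len(s_after - s_before) / len(s_before) if s_before else 0.0
        ),
        # Non-sensitive rules lost or gained, relative to the originals.
        "non_sensitive_variation": (
            len(n_before ^ n_after) / len(n_before) if n_before else 0.0
        ),
    }
```

Under these assumptions, the stated goal corresponds to choosing K so that `hidden_sensitive` is as high as possible while `new_sensitive` and `non_sensitive_variation` stay as low as possible.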