Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, India.
Electronics and Communication Engineering, Karunya Institute of Technology and Sciences, Coimbatore, Tamil Nadu, India.
Front Public Health. 2023 Mar 3;11:1125011. doi: 10.3389/fpubh.2023.1125011. eCollection 2023.
Digital health data collection is vital for healthcare and medical research. However, the data contain sensitive information about patients, which makes collection challenging. To collect health data without privacy breaches, the data must be secured between the data owner and the collector. Existing data collection studies rely on overly stringent assumptions, such as a third-party anonymizer or a private channel between the data owner and the collector. Because they involve a third party, these approaches are more susceptible to privacy attacks, which makes them less applicable to privacy-preserving healthcare data collection. This article proposes a novel privacy-preserving data collection protocol that anonymizes healthcare data without using a third-party anonymizer or a private channel for data transmission. A clustering-based k-anonymity model is adopted to efficiently prevent identity disclosure attacks, and communication between the data owners and the collector is restricted to elected representatives of each equivalence group of data owners. We also identify a privacy attack, known as "leader collusion", in which the elected representatives may collaborate to violate an individual's privacy. We propose solutions for such collusion and for sensitive attribute protection. A greedy heuristic method is devised to efficiently handle data owners who join or leave the anonymization process dynamically. Furthermore, we discuss potential privacy attacks on the proposed protocol and provide a theoretical analysis. Extensive experiments are conducted on real-world datasets, and the results suggest that our solution outperforms state-of-the-art techniques in terms of privacy protection and computational complexity.
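To make the clustering-based k-anonymity idea concrete, the following is a minimal, centralized sketch and not the protocol proposed in the article: it omits leader election, the distributed exchange between data owners, and the defenses against leader collusion. The record layout (an age and a numeric ZIP prefix as quasi-identifiers), the Manhattan distance, and the function names `greedy_k_clusters` and `generalize` are all assumptions made for illustration. It groups records greedily into clusters of at least k members and then replaces each quasi-identifier with the range it spans in its cluster, so that every record is indistinguishable from at least k - 1 others.

```python
from typing import List, Tuple

# Hypothetical quasi-identifiers for illustration: (age, zip_prefix).
Record = Tuple[int, int]
Range = Tuple[int, int]


def distance(a: Record, b: Record) -> int:
    """Manhattan distance between two records (an assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def greedy_k_clusters(records: List[Record], k: int) -> List[List[Record]]:
    """Greedily partition records into clusters of at least k members."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = sorted(records)
    clusters: List[List[Record]] = []
    while len(remaining) >= k:
        seed = remaining.pop(0)
        # Take the k - 1 records closest to the seed.
        remaining.sort(key=lambda r: distance(seed, r))
        clusters.append([seed] + remaining[:k - 1])
        remaining = remaining[k - 1:]
    # Fewer than k records are left: attach each to the cluster
    # whose seed (first member) is nearest.
    for r in remaining:
        nearest = min(clusters, key=lambda c: distance(c[0], r))
        nearest.append(r)
    return clusters


def generalize(cluster: List[Record]) -> Tuple[Range, ...]:
    """Replace each quasi-identifier by its (min, max) range in the cluster."""
    return tuple((min(col), max(col)) for col in zip(*cluster))


records = [(25, 100), (27, 101), (26, 100), (60, 200),
           (62, 201), (61, 200), (40, 150)]
for cluster in greedy_k_clusters(records, k=3):
    print(len(cluster), "records share", generalize(cluster))
```

Attaching leftover records to the nearest existing cluster, rather than suppressing them, keeps every record in the output at the cost of wider ranges; this is one common design choice, not necessarily the one the article uses.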