Efficiently hiding sensitive itemsets with transaction deletion based on genetic algorithms.

Author information

Lin Chun-Wei, Zhang Binbin, Yang Kuo-Tung, Hong Tzung-Pei

Affiliations

Innovative Information Industry Research Center (IIIRC), School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen 518055, China; Shenzhen Key Laboratory of Internet Information Collaboration, School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen 518055, China.

Medical School, Shenzhen University, Shenzhen 518060, China.

Publication information

ScientificWorldJournal. 2014;2014:398269. doi: 10.1155/2014/398269. Epub 2014 Sep 1.

Abstract

Data mining extracts meaningful and useful information or knowledge from very large databases. Because data mining techniques can also uncover secure or private information, they carry an inherent risk to privacy. Privacy-preserving data mining (PPDM) has therefore arisen in recent years to sanitize the original database so that sensitive information is hidden, a task that can be regarded as an NP-hard problem. This paper proposes a compact prelarge GA-based algorithm (cpGA2DT) that deletes transactions to hide sensitive itemsets. It addresses the limitations of the evolutionary process by adopting both the compact GA-based (cGA) mechanism and the prelarge concept. A flexible fitness function with three adjustable weights is designed to find appropriate transactions to delete, so that sensitive itemsets are hidden with minimal side effects: hiding failure, missing cost, and artificial cost. Experiments compare the proposed cpGA2DT algorithm with the simple GA-based (sGA2DT) algorithm and a greedy approach in terms of execution time and the three side effects.
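To make the three side effects concrete, the following is a minimal sketch of how a weighted fitness value for one candidate set of deleted transactions might be computed. It is not the authors' implementation. The function names (`frequent_itemsets`, `fitness`), the toy database, and the linear form `w1 * hiding_failure + w2 * missing_cost + w3 * artificial_cost` are assumptions made for illustration. The sketch also recomputes frequent itemsets by brute force after each deletion, whereas the abstract indicates that cpGA2DT relies on the prelarge concept; that mechanism is not modelled here.

```python
from itertools import combinations
from math import ceil


def frequent_itemsets(transactions, min_support):
    """Brute-force frequent itemset enumeration (toy databases only)."""
    if not transactions:
        return set()
    # Minimum count is relative to the current database size.
    min_count = ceil(min_support * len(transactions))
    items = sorted(set().union(*transactions))
    result = set()
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_count:
                result.add(itemset)
    return result


def fitness(transactions, deleted, sensitive, min_support,
            w1=1.0, w2=1.0, w3=1.0):
    """Weighted sum of the three side effects (assumed linear form).

    deleted   -- set of transaction indices removed by the candidate solution
    sensitive -- set of frozensets that should no longer be frequent
    """
    before = frequent_itemsets(transactions, min_support)
    remaining = [t for i, t in enumerate(transactions) if i not in deleted]
    after = frequent_itemsets(remaining, min_support)

    # Sensitive itemsets that are still frequent after sanitization.
    hiding_failure = len(sensitive & after)
    # Non-sensitive itemsets that were frequent but are now lost.
    missing_cost = len((before - sensitive) - after)
    # Itemsets that were not frequent before but are frequent now.
    artificial_cost = len(after - before)

    return w1 * hiding_failure + w2 * missing_cost + w3 * artificial_cost


# Hypothetical toy database of six transactions over items a, b, c.
transactions = [
    frozenset("abc"), frozenset("ab"), frozenset("ac"),
    frozenset("bc"), frozenset("abc"), frozenset("c"),
]
sensitive = {frozenset("ab")}

for deleted in (set(), {1}, {0}, {3, 5}):
    print(sorted(deleted), fitness(transactions, deleted, sensitive, 0.5))
```

In this toy setting a lower value is better. Deleting a transaction that contains only the sensitive itemset hides it without losing other frequent itemsets. Deleting unrelated transactions shrinks the database, which lowers the minimum count and can make a previously infrequent itemset frequent, illustrating how artificial cost can arise even under pure deletion.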

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4547/4165802/4a6818bfa329/TSWJ2014-398269.001.jpg
