Multi-level high utility-itemset hiding.

Author information

Nguyen Loan T T, Duong Hoa, Mai An, Vo Bay

Affiliations

School of Computer Science and Engineering, International University, Ho Chi Minh City, Vietnam.

Vietnam National University, Ho Chi Minh City, Vietnam.

Publication information

PLoS One. 2025 Feb 3;20(2):e0317427. doi: 10.1371/journal.pone.0317427. eCollection 2025.

Abstract

Privacy is a critical issue in the age of data. Organizations and corporations that publicly share their data are concerned that their sensitive information may be leaked or extracted by rivals or attackers using data mining tools. High-utility itemset mining (HUIM) is an extension of frequent itemset mining (FIM) that deals with business data in the form of transaction databases, data that is also in danger of being stolen. To deal with this, a number of privacy-preserving data mining (PPDM) techniques have been introduced. An important topic in PPDM in recent years is privacy-preserving utility mining (PPUM). The goal of PPUM is to protect sensitive information, such as sensitive high-utility itemsets, in transaction databases and make it undiscoverable by data mining techniques. However, available PPUM methods do not consider the generalization of items in databases (categories, classes, groups, etc.). These algorithms only consider items at the most specialized level, leaving item combinations at higher levels vulnerable to attacks. The insights gained from higher abstraction levels are somewhat more valuable than those from lower levels since they contain the outlines of the data. To address this issue, this work proposes two PPUM algorithms, namely MLHProtector and FMLHProtector, that operate at all abstraction levels of a transaction database to protect sensitive itemsets from data mining algorithms. Empirical experiments showed that both algorithms successfully protect the itemsets from being compromised by attackers.
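To make the multi-level concern concrete, the following minimal sketch computes itemset utilities on a toy transaction database at two abstraction levels. It is not MLHProtector or FMLHProtector, whose details the abstract does not give. The data, the taxonomy, the threshold, and all function names are hypothetical, and the sketch assumes that the utility of a generalized item in a transaction is the sum of the utilities of its child items in that transaction.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps an item to its purchased quantity.
transactions = [
    {"cola": 2, "juice": 1, "chips": 3},
    {"cola": 1, "chips": 2},
    {"juice": 4, "nuts": 1},
    {"cola": 3, "nuts": 2},
]
unit_profit = {"cola": 2, "juice": 3, "chips": 1, "nuts": 5}
# Hypothetical one-level taxonomy: specialized item -> generalized category.
taxonomy = {"cola": "Drinks", "juice": "Drinks", "chips": "Snacks", "nuts": "Snacks"}


def item_level(transaction):
    """Utility of each item in a transaction: quantity times unit profit."""
    return {item: qty * unit_profit[item] for item, qty in transaction.items()}


def category_level(transaction):
    """Lift a transaction to category level by summing child-item utilities
    (an assumption of this sketch, see the lead-in)."""
    lifted = {}
    for item, qty in transaction.items():
        category = taxonomy[item]
        lifted[category] = lifted.get(category, 0) + qty * unit_profit[item]
    return lifted


def utility(itemset, utility_db):
    """Sum the utility of `itemset` over the transactions that contain all of it."""
    return sum(
        sum(t[i] for i in itemset)
        for t in utility_db
        if all(i in t for i in itemset)
    )


def high_utility_itemsets(utility_db, min_util):
    """Brute-force enumeration; only suitable for toy-sized data."""
    items = sorted({i for t in utility_db for i in t})
    found = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            u = utility(combo, utility_db)
            if u >= min_util:
                found[combo] = u
    return found


item_db = [item_level(t) for t in transactions]
category_db = [category_level(t) for t in transactions]

MIN_UTIL = 20  # hypothetical threshold
print("item level:    ", high_utility_itemsets(item_db, MIN_UTIL))
print("category level:", high_utility_itemsets(category_db, MIN_UTIL))
```

In this toy data no itemset reaches the threshold at the item level, while several generalized itemsets do at the category level. A hiding method that inspects only specialized items would therefore have nothing to sanitize here and would leave the category-level itemsets discoverable.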


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/af0f/11790145/ff0fdd36cef2/pone.0317427.g001.jpg
