Dhinakaran D, Gopalakrishnan S, Selvaraj D, Girija M S, Prabaharan G
Department of Computer Science and Engineering, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Chennai, India.
Department of Computer Science & Engineering (Data Science), Madanapalle Institute of Technology & Science, Andhra Pradesh, India.
MethodsX. 2025 Apr 17;14:103317. doi: 10.1016/j.mex.2025.103317. eCollection 2025 Jun.
In an era of data-driven decision-making, knowledge must be extracted from distributed datasets both securely and efficiently. When data are outsourced for tasks such as frequent itemset mining, however, privacy becomes a central concern: sensitive data must be protected while the insights they contain are still delivered. This paper proposes a multi-cloud privacy-preserving approach to frequent itemset extraction built on two main components, the Transaction Hewer and Allocator module and the Facile Hash Algorithm (FHA). Together, these components protect both raw and processed data during transmission and computation across the distributed cloud platforms. To address the complexity of frequent itemset mining, we also introduce the Apriori with Tid Reduction (ATid) algorithm, in which the Tid Reduction concept improves the scalability and computational efficiency of the mining process. Performance evaluation on several datasets shows that the proposed framework outperforms existing methods: the computational time of the encryption and decryption processes is reduced by up to 25% compared with the best alternative, communication costs are reduced by approximately 15% compared with the state of the art, and the framework scales with a growing number of transactions.
• Introduces a multi-cloud privacy framework with the Facile Hash Algorithm and the Transaction Hewer and Allocator.
• Enhances scalability using the ATid algorithm with Tid Reduction.
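The abstract does not spell out how Tid Reduction works, so the following is only a minimal sketch of the general idea, assuming it resembles the classic transaction-reduction refinement of Apriori: a transaction that contains no candidate k-itemset cannot contain any larger frequent itemset, so its ID is dropped before the next pass. The function name `apriori_with_transaction_reduction`, its parameters, and the sample data are hypothetical and are not taken from the paper; the sketch omits the privacy components (FHA and the Transaction Hewer and Allocator) entirely.

```python
from itertools import combinations


def apriori_with_transaction_reduction(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    transactions: dict mapping a transaction ID (tid) to an iterable of items.
    min_support: minimum absolute support count.
    Illustrative only; not the authors' ATid implementation.
    """
    # Active transactions; tids are dropped once they can no longer contribute.
    active = {tid: frozenset(items) for tid, items in transactions.items()}

    # Pass 1: count single items.
    counts = {}
    for items in active.values():
        for item in items:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Candidate generation: join frequent (k-1)-itemsets, then prune any
        # candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        counts = {}
        survivors = {}
        for tid, items in active.items():
            matched = [c for c in candidates if c <= items]
            for c in matched:
                counts[c] = counts.get(c, 0) + 1
            # Transaction reduction: a transaction with no candidate
            # k-itemset cannot support any larger itemset, so drop its tid.
            if matched:
                survivors[tid] = items
        active = survivors

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical sample data: five transactions keyed by tid.
sample = {
    1: ["a", "b", "c"],
    2: ["a", "b"],
    3: ["a", "c"],
    4: ["b", "d"],
    5: ["a", "b", "c"],
}
frequent_itemsets = apriori_with_transaction_reduction(sample, min_support=3)
```

In this sample, transaction 4 contains no candidate pair and is removed after the second pass, so later passes scan fewer transactions. That shrinking scan set is the kind of saving a tid-reduction scheme aims for as the number of transactions grows.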