Jiang Changnan, Yin Kanglong, Xia Chunhe, Huang Weidong
Key Laboratory of Beijing Network Technology, Beihang University, Beijing 100191, China.
Guangxi Key Lab of Multi-Source Information Mining and Security, Guangxi Normal University, Guilin 541004, China.
Entropy (Basel). 2022 Jul 1;24(7):919. doi: 10.3390/e24070919.
With the popularity of Android and its open-source nature, the Android platform has become an attractive target for attackers, and the detection and classification of malware has become a research hotspot. Existing malware classification methods rely on complex manual operations or on large volumes of high-quality training data. However, malware data collected by security providers contains private user information, such as user identity and behavioral habits, and the growing concern for user privacy poses a challenge to current malware classification schemes. To address this problem, we propose a new Android malware classification scheme based on federated learning (FL), named FedHGCDroid, which classifies malware on Android clients in a privacy-preserving manner. First, we combine a convolutional neural network and a graph neural network to design a novel multi-dimensional malware classification model, HGCDroid, which effectively extracts malicious behavior features in order to classify malware accurately. Second, we introduce an FL framework that enables distributed Android clients to collaboratively train a comprehensive Android malware classification model in a privacy-preserving way. Finally, to adapt to the non-IID distribution of malware across Android clients, we propose FedAdapt, a contribution degree-based adaptive classifier training mechanism that improves the adaptability of the FL-based malware classifier. Comprehensive experiments on the Androzoo dataset (under different non-IID data settings) show that FedHGCDroid achieves greater adaptability and higher accuracy than other state-of-the-art methods.
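The abstract does not say how FedAdapt computes or applies its contribution degrees, so the following is only a minimal sketch of the general idea of contribution-weighted aggregation in federated learning, not the authors' method. The function name `aggregate`, the `contributions` scores, and the dictionary-of-lists parameter layout are all hypothetical choices made for illustration.

```python
from typing import Dict, List

# Hypothetical layout: each client model is a dict of named, flattened parameter vectors.
Params = Dict[str, List[float]]


def aggregate(client_params: List[Params], contributions: List[float]) -> Params:
    """Average client parameters, weighting each client by its normalized contribution.

    This is a generic weighted federated average. How a contribution score is
    derived is an assumption left to the caller; it is NOT taken from the paper.
    """
    if not client_params or len(client_params) != len(contributions):
        raise ValueError("need one contribution score per client")
    total = sum(contributions)
    if total <= 0:
        raise ValueError("contribution scores must sum to a positive value")

    # Normalize so the weights sum to 1.
    weights = [c / total for c in contributions]

    merged: Params = {}
    for name in client_params[0]:
        size = len(client_params[0][name])
        merged[name] = [
            sum(w * params[name][i] for w, params in zip(weights, client_params))
            for i in range(size)
        ]
    return merged


# Two clients; the second is given three times the weight of the first.
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
global_params = aggregate(clients, [1.0, 3.0])
print(global_params)
```

With equal contribution scores this reduces to a plain unweighted average of the client models; only parameters, never raw client data, are passed to the aggregation step.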