An Android Malware Detection Approach to Enhance Node Feature Differences in a Function Call Graph Based on GCNs.

Affiliations

School of Software, Xinjiang University, Urumqi 830091, China.

College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China.

Publication information

Sensors (Basel). 2023 May 13;23(10):4729. doi: 10.3390/s23104729.

DOI:10.3390/s23104729

PMID:37430643

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10224091/

Abstract

The smartphone has become an indispensable tool in daily life, and the Android operating system is widely installed on smartphones. This makes Android smartphones a prime target for malware. To address this threat, researchers have proposed many malware detection approaches, including ones based on the function call graph (FCG). Although an FCG captures the complete caller-callee semantic relationships among functions, it is a very large graph, and its many nonsensical nodes reduce detection efficiency. In addition, the characteristics of graph neural networks (GNNs) cause the features of important nodes in the FCG to become similar to those of the nonsensical nodes during propagation. In this work, we propose an Android malware detection approach that enhances node feature differences in an FCG. First, we propose an API-based node feature that lets us visually analyze the behavioral properties of different functions in an app and determine whether their behavior is benign or malicious. Second, we extract the FCG and the features of each function from the decompiled APK file. Third, we calculate an API coefficient inspired by the TF-IDF algorithm and extract the sensitive function call subgraph (S-FCSG) based on the API coefficient ranking. Finally, before feeding the S-FCSG and node features into the graph convolutional network (GCN) model, we add a self-loop to each node of the S-FCSG. A 1-D convolutional neural network and fully connected layers are then used for further feature extraction and classification, respectively. The experimental results show that our approach enhances the node feature differences in an FCG and achieves higher detection accuracy than models using other features, suggesting that malware detection based on graph structures and GNNs leaves considerable room for future study.
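
The abstract does not state the API coefficient formula, so the following is only a minimal sketch of a TF-IDF-style weighting followed by rank-based subgraph extraction. The function names `api_coefficients` and `sensitive_subgraph`, the per-function score (the sum of the coefficients of the APIs a function calls), the `top_k` cutoff, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def api_coefficients(apps):
    """TF-IDF-style weight per API (assumed formula, not the paper's).

    apps: list of apps, each given as a list of API names it invokes.
    """
    n_apps = len(apps)
    term_freq = Counter(api for app in apps for api in app)      # total uses
    doc_freq = Counter(api for app in apps for api in set(app))  # apps using it
    total = sum(term_freq.values())
    return {
        api: (term_freq[api] / total) * math.log(n_apps / doc_freq[api])
        for api in term_freq
    }


def sensitive_subgraph(edges, node_apis, coeff, top_k):
    """Keep the top_k functions by summed API coefficient and the edges among them."""
    score = {
        fn: sum(coeff.get(api, 0.0) for api in apis)
        for fn, apis in node_apis.items()
    }
    # Sort by descending score; break ties by name so the result is deterministic.
    ranked = sorted(score, key=lambda fn: (-score[fn], fn))
    keep = set(ranked[:top_k])
    return keep, [(u, v) for u, v in edges if u in keep and v in keep]


# Toy corpus: an API used by every app (Log.d) gets coefficient 0.
apps = [
    ["sendTextMessage", "getDeviceId", "Log.d"],
    ["Log.d", "openConnection"],
    ["Log.d", "getDeviceId"],
]
coeff = api_coefficients(apps)

# Toy function call graph for one app: caller -> callee edges.
edges = [("onCreate", "leak"), ("leak", "sync"), ("sync", "helper")]
node_apis = {
    "onCreate": ["Log.d"],
    "leak": ["getDeviceId", "sendTextMessage"],
    "sync": ["openConnection"],
    "helper": [],
}
keep, sub_edges = sensitive_subgraph(edges, node_apis, coeff, top_k=2)
```

In this toy example the two highest-scoring functions are `leak` and `sync`, so only the edge between them survives. Functions that call only ubiquitous APIs, or none at all, are dropped, which is the intuition behind pruning nonsensical nodes.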

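
To illustrate why adding a self-loop to each node matters, here is a minimal sketch of one aggregation step of the standard GCN propagation rule with symmetric normalization, with the learned weight matrix and nonlinearity omitted. It assumes an undirected (symmetrized) adjacency matrix and a toy star graph; the helper names are hypothetical, and the paper's actual model may differ.

```python
import math


def add_self_loops(adj):
    """Return A + I: a copy of the adjacency matrix with 1s on the diagonal."""
    n = len(adj)
    return [[1.0 if i == j else float(adj[i][j]) for j in range(n)] for i in range(n)]


def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} (assumes a symmetric matrix)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [
        [adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def propagate(norm_adj, feats):
    """One aggregation step: each node takes a weighted sum of neighbor features."""
    n, d = len(feats), len(feats[0])
    return [
        [sum(norm_adj[i][k] * feats[k][c] for k in range(n)) for c in range(d)]
        for i in range(n)
    ]


# Star graph: node 0 is the only node with a distinctive feature.
adj = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
feats = [[1.0], [0.0], [0.0], [0.0]]

without_loops = propagate(normalize(adj), feats)
with_loops = propagate(normalize(add_self_loops(adj)), feats)

center_without = without_loops[0][0]  # 0.0: node 0's own feature is lost
center_with = with_loops[0][0]        # 0.25: node 0 keeps part of its own feature
```

Without self-loops, node 0's new feature is built only from its neighbors, so its distinctive value vanishes after one step. With self-loops, part of the node's own feature is carried through aggregation.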