Vulnerability detection in Java source code using a quantum convolutional neural network with self-attentive pooling, deep sequence, and graph-based hybrid feature extraction.

Author information

Hussain Shumaila, Nadeem Muhammad, Baber Junaid, Hamdi Mohammed, Rajab Adel, Al Reshan Mana Saleh, Shaikh Asadullah

Affiliations

Department of Computer Science, Sardar Bahadur Khan Women's University, Quetta, Pakistan.

Department of Computer Science and IT, University of Balochistan, Quetta, Pakistan.

Publication information

Sci Rep. 2024 Mar 28;14(1):7406. doi: 10.1038/s41598-024-56871-z.

Abstract

Software vulnerabilities pose a significant threat to system security, necessitating effective automatic detection methods. Current techniques face challenges such as dependency issues, language bias, and coarse detection granularity. This study presents a novel deep learning-based vulnerability detection system for Java code. Hybrid feature extraction with graph- and sequence-based techniques improves the system's semantic and syntactic understanding of code. The system utilizes control flow graphs (CFG), abstract syntax trees (AST), program dependencies (PD), and greedy longest-match-first vectorization for graph representation. A hybrid neural network (GCN-RFEMLP) and the pre-trained CodeBERT model extract features, which are fed into a quantum convolutional neural network with self-attentive pooling. The system addresses long-term information dependency and coarse detection granularity by employing an intermediate code representation and inter-procedural code slices. To mitigate language bias, the benchmark Software Assurance Reference Dataset (SARD) is employed. In evaluations, the system achieves 99.2% accuracy in detecting vulnerabilities, outperforming benchmark methods. The proposed approach covers vulnerabilities listed in the Common Weakness Enumeration (CWE), including improper input validation, missing authorization, buffer overflow, cross-site scripting, and SQL injection.
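The abstract names greedy longest-match-first vectorization without describing it further. The sketch below illustrates the general idea as it is commonly used in WordPiece-style tokenizers: each identifier is split, left to right, into the longest pieces found in a vocabulary, and the pieces are then mapped to integer ids. It is not the authors' implementation. The vocabulary `VOCAB`, the `##` continuation marker, the `[UNK]` fallback, and the function names are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical toy vocabulary (an assumption, not taken from the paper).
# "##" marks a piece that continues a token, following the WordPiece convention.
VOCAB: Dict[str, int] = {
    "[UNK]": 0,
    "get": 1,
    "##User": 2,
    "##Input": 3,
    "##In": 4,
    "##put": 5,
    "query": 6,
}


def greedy_longest_match(token: str, vocab: Dict[str, int], unk: str = "[UNK]") -> List[str]:
    """Split one token into the longest vocabulary pieces, scanning left to right."""
    pieces: List[str] = []
    start = 0
    while start < len(token):
        end = len(token)
        match = None
        # Shrink the candidate from the right until it appears in the vocabulary.
        while end > start:
            candidate = token[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits at this position: treat the whole token as unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


def vectorize(tokens: List[str], vocab: Dict[str, int]) -> List[int]:
    """Map a list of code tokens to a flat list of vocabulary ids."""
    ids: List[int] = []
    for token in tokens:
        ids.extend(vocab[piece] for piece in greedy_longest_match(token, vocab))
    return ids
```

With this toy vocabulary, `greedy_longest_match("getUserInput", VOCAB)` yields the pieces `get`, `##User`, `##Input`: the longer piece `##Input` is preferred over `##In` followed by `##put`, which is what makes the match greedy.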

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/16ff/10978945/59e8818fd73e/41598_2024_56871_Fig1_HTML.jpg
