Liu Lan, Hui Zhanfa, Chen Guiming, Cai Tingfeng, Zhou Chiyu
School of Electronic and Information Engineering, Guangdong Polytechnic Normal University, Guangzhou, 510655, Guangdong, China.
Sci Rep. 2025 Aug 1;15(1):28176. doi: 10.1038/s41598-025-10650-6.
Accurate vulnerability prediction is crucial for identifying potential security risks in software, especially on imbalanced and complex real-world datasets. Traditional methods, such as single-task learning and ensemble approaches, often struggle with these challenges, particularly in detecting rare but critical vulnerabilities. To address this, we propose MTLPT (Multi-Task Learning with Position Encoding and Lightweight Transformer for Vulnerability Prediction), a novel multi-task learning framework that leverages custom lightweight Transformer blocks and position-encoding layers to capture long-range dependencies and complex patterns in source code. MTLPT improves sensitivity to rare vulnerabilities and incorporates a dynamic weight loss function to compensate for class imbalance. Our experiments on real-world vulnerability datasets demonstrate that MTLPT outperforms traditional methods on key performance metrics such as recall, F1-score, AUC, and MCC. Ablation studies further validate the contributions of the lightweight Transformer blocks, the position-encoding layers, and the dynamic weight loss function, confirming their role in enhancing the model's predictive accuracy and efficiency.
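The abstract names a position-encoding layer and a dynamic weight loss for imbalanced data without specifying either. As a minimal sketch only, the NumPy snippet below shows two standard choices that match these descriptions: the sinusoidal position encoding from the original Transformer, and inverse-frequency class weights applied in a weighted cross-entropy. All function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sinusoidal_position_encoding(seq_len, d_model):
    # Standard sinusoidal encoding: sin on even dims, cos on odd dims,
    # with geometrically spaced frequencies over the model dimension.
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(d_model)[None, :]            # (1, d_model)
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def inverse_frequency_weights(labels, n_classes):
    # Per-class weights inversely proportional to class frequency, so
    # rare (e.g. vulnerable) classes contribute more to the loss.
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    return counts.sum() / (n_classes * np.maximum(counts, 1.0))

def weighted_cross_entropy(probs, labels, weights):
    # Cross-entropy where each sample's loss is scaled by its class weight.
    eps = 1e-12
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(-weights[labels] * np.log(picked + eps)))

# Example: a 4:1 imbalanced label set gives the minority class a larger weight.
labels = np.array([0, 0, 0, 0, 1])
w = inverse_frequency_weights(labels, n_classes=2)   # → [0.625, 2.5]
```

A dynamic variant, as the abstract suggests, would recompute or anneal these weights during training (e.g. per batch or per epoch) rather than fixing them once from the full dataset.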