Department of Computer Science, Aligarh Muslim University, Aligarh, India.
Department of Computer Science, College of Engineering and Computer Science, Jazan University, Jazan, Saudi Arabia.
PLoS One. 2024 Jul 11;19(7):e0307112. doi: 10.1371/journal.pone.0307112. eCollection 2024.
Maintaining quality in software development projects is becoming increasingly difficult because the complexity of software modules is growing exponentially. Software defects are the primary concern, and software defect prediction (SDP) plays a crucial role in detecting faulty modules early and planning effective testing to reduce maintenance costs. However, SDP faces challenges such as imbalanced data, high-dimensional features, model overfitting, and outliers. Moreover, traditional SDP models lack transparency and interpretability, which undermines stakeholder confidence in the Software Development Life Cycle (SDLC). We propose SPAM-XAI, a hybrid model integrating novel sampling, feature selection, and eXplainable-AI (XAI) algorithms to address these challenges. The SPAM-XAI model reduces the number of features, optimizes the model, and lowers time and space complexity, enhancing its robustness. In experiments on datasets from the NASA PROMISE repository, the SPAM-XAI model achieved an accuracy of 98.13% on CM1, 96.00% on PC1, and 98.65% on PC2, surpassing previous state-of-the-art and baseline models, and it also improved on existing methods in the other evaluation metrics. The SPAM-XAI model increases transparency and facilitates understanding of the interaction between features and error status, enabling coherent and comprehensible predictions. This improves the decision-making process and enhances the model's trustworthiness in the SDLC.
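The abstract does not specify which sampling or feature-selection algorithms SPAM-XAI uses, so the following is only a minimal sketch of the two general ideas it names: rebalancing an imbalanced defect dataset and reducing the feature set. Random oversampling and a class-mean-gap filter are stand-ins chosen for illustration, not the authors' methods, and the toy module metrics and all function names are hypothetical.

```python
import random
from statistics import mean


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until both classes have equal size.

    Assumption: labels are 0 (clean) or 1 (defective). This is plain random
    oversampling, used here as a stand-in for the paper's sampling step.
    """
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    minority = min(by_class, key=lambda c: len(by_class[c]))
    majority = 1 - minority
    if not by_class[minority]:
        raise ValueError("need at least one example of each class")
    deficit = len(by_class[majority]) - len(by_class[minority])
    extra = [rng.choice(by_class[minority]) for _ in range(deficit)]
    out_rows = by_class[majority] + by_class[minority] + extra
    out_labels = ([majority] * len(by_class[majority])
                  + [minority] * (len(by_class[minority]) + deficit))
    return out_rows, out_labels


def rank_features(rows, labels):
    """Rank feature indices by the absolute gap between the two class means.

    A deliberately simple filter score; it is scale-dependent, so real data
    would normally be standardized first.
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        defective = [r[j] for r, y in zip(rows, labels) if y == 1]
        clean = [r[j] for r, y in zip(rows, labels) if y == 0]
        scores.append(abs(mean(defective) - mean(clean)))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def select_top_k(rows, ranking, k):
    """Keep only the k best-ranked feature columns, in their original order."""
    keep = sorted(ranking[:k])
    return [[r[j] for j in keep] for r in rows]


# Hypothetical toy data: three made-up metrics per module, one defective module.
rows = [[10, 1, 5], [12, 2, 5], [11, 1, 6], [13, 2, 5], [40, 1, 5]]
labels = [0, 0, 0, 0, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
ranking = rank_features(balanced_rows, balanced_labels)
reduced_rows = select_top_k(balanced_rows, ranking, k=2)
```

On this toy data the single defective module is duplicated until the classes are balanced, and the first metric ranks highest because it separates the two classes most strongly. A ranking of this kind also gives a crude, inspectable account of which features drive the defect label, which is the sort of transparency the abstract attributes to the XAI component.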