MLR-predictor: a versatile and efficient computational framework for multi-label requirements classification.

Authors

Saleem Summra, Asim Muhammad Nabeel, Van Elst Ludger, Junker Markus, Dengel Andreas

Affiliations

Department of Computer Science, Rheinland Pfälzische Technische Universität, Kaiserslautern, Germany.

German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany.

Publication information

Front Artif Intell. 2024 Nov 27;7:1481581. doi: 10.3389/frai.2024.1481581. eCollection 2024.

Abstract

INTRODUCTION

Requirements classification is essential to developing successful software because it ensures that all relevant aspects of users' needs are captured. It also helps identify project failure risks and supports reaching project milestones more comprehensively. Several machine learning predictors have been developed for binary or multi-class requirements classification. However, only a few predictors are designed for multi-label classification, and their low predictive performance limits their practical use.

METHOD

MLR-Predictor uses the Okapi BM25 model to transform requirements text into statistical vectors by computing informative word patterns. The predictor then transforms the multi-label requirements classification data into a multi-class classification problem and uses a logistic regression classifier to categorize requirements. Its performance is evaluated and compared with 123 machine learning and 9 deep learning-based predictive pipelines across three public benchmark requirements classification datasets using eight evaluation measures.
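The two transformations described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the tokenizer, the BM25 parameters, the IDF variant, or the exact multi-label to multi-class mapping. The sketch assumes whitespace tokenization, the common defaults `k1=1.5` and `b=0.75`, a smoothed non-negative IDF, and a label-powerset style mapping in which each distinct label combination becomes one class. The function names `bm25_vectors` and `label_powerset` are hypothetical.

```python
import math
from collections import Counter


def bm25_vectors(docs, k1=1.5, b=0.75):
    """Return (vocabulary, vectors): one BM25-weighted vector per document.

    Assumptions (not from the paper): whitespace tokenization, k1/b defaults,
    and the smoothed IDF log(1 + (N - df + 0.5) / (df + 0.5)).
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: number of documents containing each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))

    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [0.0] * len(vocab)
        # Length normalization: longer-than-average documents are damped.
        norm = k1 * (1 - b + b * len(toks) / avg_len)
        for tok, freq in tf.items():
            idf = math.log(1 + (n_docs - df[tok] + 0.5) / (df[tok] + 0.5))
            vec[index[tok]] = idf * freq * (k1 + 1) / (freq + norm)
        vectors.append(vec)
    return vocab, vectors


def label_powerset(label_sets):
    """Map each distinct label combination to a single multi-class id."""
    class_of = {}
    y = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in class_of:
            class_of[key] = len(class_of)
        y.append(class_of[key])
    # classes[i] lists the original labels represented by class id i.
    classes = [sorted(k) for k, _ in sorted(class_of.items(), key=lambda kv: kv[1])]
    return y, classes


# Toy requirements and label sets, invented purely for illustration.
docs = [
    "the system shall encrypt data",
    "the system shall respond fast",
    "users can export data",
]
labels = [["security"], ["performance", "usability"], ["security"]]

vocab, X = bm25_vectors(docs)
y, classes = label_powerset(labels)
```

The resulting `X` and `y` have the shape a standard multi-class classifier expects, so a logistic regression model could be fitted on them directly. Predicted class ids map back to label sets through `classes`.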

RESULTS

The large-scale experimental results demonstrate that the proposed MLR-Predictor outperforms the 123 machine learning and 9 deep learning predictive pipelines, as well as the state-of-the-art requirements classification predictor. Specifically, compared with the state-of-the-art predictor, it achieves a 13% improvement in macro F1-measure on the PROMISE dataset, a 1% improvement on the EHR-binary dataset, and a 2.5% improvement on the EHR-multiclass dataset.
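For readers unfamiliar with the headline metric, macro F1 averages the per-class F1 scores with equal weight, so rare classes count as much as frequent ones. The sketch below is a minimal single-label illustration with invented toy labels. It does not reproduce the paper's evaluation code or data, and the function name `macro_f1` is hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: class 1 is rare and always missed.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 2, 2]
score = macro_f1(y_true, y_pred)  # per-class F1: 1.0, 0.0, 2/3
```

In this toy case five of six predictions are correct, yet macro F1 is only 5/9 (about 0.56) because the missed rare class contributes a zero. This is why the measure is informative on imbalanced requirements datasets.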

DISCUSSION

As a case study, the generalizability of the proposed predictor is evaluated on software customer review classification data. In this setting, the proposed predictor outperforms the state-of-the-art BERT language model by 1.4% in F1 score. These findings underscore the robustness and effectiveness of the proposed MLR-Predictor in various contexts and establish it as a promising solution for the requirements classification task.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2153/11632133/d1a4d1a7f77e/frai-07-1481581-g0001.jpg