Dynamic stacking ensemble for cross-language code smell detection.

Author information

Aljamaan Hamoud

Affiliations

Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia.

Interdisciplinary Research Center for Finance and Digital Economy, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia.

Publication information

PeerJ Comput Sci. 2024 Aug 15;10:e2254. doi: 10.7717/peerj-cs.2254. eCollection 2024.

Abstract

Code smells are poor design and implementation choices by software engineers that might affect overall software quality. Code smell detection using machine learning has become a popular research area, with the aim of building effective models capable of detecting different code smells in multiple programming languages. However, the process of building such models has not yet stabilized, and most existing research focuses on Java code smell detection. The main objective of this article is to propose dynamic ensembles built using two strategies, namely greedy search and backward elimination, which are capable of accurately detecting code smells in two programming languages (i.e., Java and Python) and which are less complex than full stacking ensembles. The detection performance of dynamic ensembles was investigated within the context of four Java and two Python code smells. The greedy search and backward elimination strategies yielded different lists of base models from which to build dynamic ensembles. In comparison to full stacking ensembles, dynamic ensembles yielded less complex models for most of the investigated Java and Python code smells, with the backward elimination strategy resulting in less complex models. Dynamic ensembles performed comparably to full stacking ensembles, with no significant loss in detection performance. This article concludes that dynamic stacking ensembles were able to facilitate effective and stable detection of Java and Python code smells over all base models and with less complexity than full stacking ensembles.
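The two selection strategies named in the abstract can be illustrated with a minimal, model-agnostic sketch. This is not the author's implementation: the function names, the stopping rules, the base-model names, and the toy scoring function are all assumptions made for illustration. In a real setting, the scoring function would train a stacking ensemble on the given subset of base models and return a cross-validated detection metric.

```python
from typing import Callable, FrozenSet, List, Sequence

# A scoring function maps a set of base-model names to a validation score
# (higher is better). Assumption: in practice it would train a stacking
# ensemble on that subset; here it is left abstract.
ScoreFn = Callable[[FrozenSet[str]], float]


def greedy_search(candidates: Sequence[str], score_fn: ScoreFn) -> List[str]:
    """Forward selection: grow the ensemble while the score strictly improves."""
    selected: List[str] = []
    remaining = list(candidates)
    best_score = float("-inf")
    while remaining:
        # Try adding each remaining model and keep the best-scoring addition.
        scored = [(score_fn(frozenset(selected + [m])), m) for m in remaining]
        top_score, top_model = max(scored, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no candidate improves the ensemble any further
        selected.append(top_model)
        remaining.remove(top_model)
        best_score = top_score
    return selected


def backward_elimination(candidates: Sequence[str], score_fn: ScoreFn) -> List[str]:
    """Start from the full ensemble and drop models while the score does not fall."""
    selected = list(candidates)
    best_score = score_fn(frozenset(selected))
    while len(selected) > 1:
        # Try removing each model and keep the removal that scores best.
        scored = [
            (score_fn(frozenset(m for m in selected if m != drop)), drop)
            for drop in selected
        ]
        top_score, top_drop = max(scored, key=lambda pair: pair[0])
        if top_score < best_score:
            break  # every possible removal would hurt the score
        selected.remove(top_drop)
        best_score = top_score
    return selected


def toy_score(subset: FrozenSet[str]) -> float:
    """Hypothetical score with made-up numbers, used only to exercise the code."""
    strength = {"tree": 80.0, "svm": 78.0, "knn": 70.0, "nb": 65.0}
    score = max(strength[m] for m in subset)
    if {"tree", "svm"} <= subset:
        score += 3.0  # pretend these two models complement each other
    if "nb" in subset:
        score -= 1.0  # pretend this model adds noise to the meta-learner
    return score


MODELS = ["tree", "svm", "knn", "nb"]  # hypothetical base-model names
print(greedy_search(MODELS, toy_score))
print(backward_elimination(MODELS, toy_score))
```

On this toy score both strategies happen to settle on the same two-model ensemble, dropping the redundant and the harmful model. With a real, noisy validation metric the two search directions can end at different subsets, which is consistent with the abstract's report that the strategies yielded different base model lists.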

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0464/11419637/eabb9ac3792d/peerj-cs-10-2254-g001.jpg
