Aljamaan Hamoud
Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia.
Interdisciplinary Research Center for Finance and Digital Economy, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia.
PeerJ Comput Sci. 2024 Aug 15;10:e2254. doi: 10.7717/peerj-cs.2254. eCollection 2024.
Code smells are poor design and implementation choices by software engineers that might affect overall software quality. Code smell detection using machine learning has become a popular research area, aiming to build effective models capable of detecting different code smells in multiple programming languages. However, the process of building such models has not yet stabilized, and most existing research focuses on Java code smell detection. The main objective of this article is to propose dynamic ensembles built using two strategies, namely greedy search and backward elimination, which are capable of accurately detecting code smells in two programming languages (i.e., Java and Python) and which are less complex than full stacking ensembles. The detection performance of dynamic ensembles was investigated in the context of four Java and two Python code smells. The greedy search and backward elimination strategies yielded different lists of base models from which to build dynamic ensembles. Compared with full stacking ensembles, dynamic ensembles yielded less complex models for most of the investigated Java and Python code smells, with the backward elimination strategy resulting in less complex models. Dynamic ensembles performed comparably to full stacking ensembles, with no significant loss in detection performance. This article concludes that dynamic stacking ensembles facilitated effective and stable detection of Java and Python code smells over all base models, with less complexity than full stacking ensembles.
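The two selection strategies named in the abstract can be illustrated with a minimal sketch. This is not the article's implementation: the abstract does not give the stopping rules, the scoring metric, or the base models, so everything below is an assumption. In particular, `score` is a hypothetical callable that would, in practice, train a stacking ensemble on the given base models and return its validated detection performance; here it is replaced by a toy function, and the model names and weights are invented for illustration only.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence

# Hypothetical type: scores an ensemble built from the given base-model names.
ScoreFn = Callable[[FrozenSet[str]], float]


def greedy_search(candidates: Sequence[str], score: ScoreFn) -> List[str]:
    """Forward selection: start empty and repeatedly add the base model
    that raises the score most; stop when no addition improves it."""
    selected: List[str] = []
    remaining = list(candidates)
    best = float("-inf")
    while remaining:
        trials = {m: score(frozenset(selected + [m])) for m in remaining}
        top = max(remaining, key=lambda m: trials[m])
        if trials[top] <= best:
            break  # no candidate improves the current ensemble
        selected.append(top)
        remaining.remove(top)
        best = trials[top]
    return selected


def backward_elimination(candidates: Sequence[str], score: ScoreFn) -> List[str]:
    """Start from the full list and repeatedly drop the base model whose
    removal costs least, as long as the score does not decrease."""
    selected = list(candidates)
    current = score(frozenset(selected))
    while len(selected) > 1:
        trials = {
            m: score(frozenset(x for x in selected if x != m)) for m in selected
        }
        drop = max(selected, key=lambda m: trials[m])
        if trials[drop] < current:
            break  # every removal would hurt the ensemble
        selected.remove(drop)
        current = trials[drop]
    return selected


# Toy stand-in for a real evaluation (assumed values, not results from the
# article): each model contributes a fixed gain, and every included model
# costs a small complexity penalty.
TOY_GAIN: Dict[str, float] = {"rf": 0.5, "svm": 0.3, "knn": 0.0, "nb": 0.005}


def toy_score(models: FrozenSet[str]) -> float:
    return sum(TOY_GAIN[m] for m in models) - 0.01 * len(models)


candidates = ["knn", "nb", "rf", "svm"]
print("greedy search:       ", greedy_search(candidates, toy_score))
print("backward elimination:", backward_elimination(candidates, toy_score))
```

On this toy score both strategies happen to settle on the same subset; with a real, noisier evaluation they can diverge, which is consistent with the abstract's observation that the two strategies yielded different lists of base models.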