An empirical analysis on webservice antipattern prediction in different variants of machine learning perspective.

Authors

Kumar Lov, Tummalapalli Sahiti, Murthy Lalita Bhanu, Misra Sanjay, Krishna Aneesh

Affiliations

NIT Kurukshetra, Kurukshetra, Haryana, India.

BITS Pilani, Hyderabad, India.

Publication information

Sci Rep. 2025 Feb 12;15(1):5183. doi: 10.1038/s41598-025-86454-5.

DOI:10.1038/s41598-025-86454-5
PMID:39939623
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11822131/
Abstract

Anti-patterns are explicit structures in a design that represent significant violations of software design principles and degrade software design quality. Their presence strongly affects the maintainability and perception of software systems. It is therefore necessary to predict anti-patterns at an early stage and refactor them to improve software quality in terms of execution cost, maintenance cost, and memory consumption. In the anti-pattern prediction domain, very little work has addressed the class imbalance and feature redundancy problems jointly to improve model performance and prediction accuracy. The literature survey also indicates a dearth of studies offering a comprehensive comparative analysis of different sampling and feature selection strategies. To improve precision and performance, this research builds a web service anti-pattern prediction model over preprocessed software source code metrics, using sampling techniques to handle imbalanced data and feature selection techniques to handle feature redundancy. With this in mind, we applied different variants of aggregation measures to compute the metrics at the system level. These extracted metrics serve as model input, so we also applied different variants of feature selection techniques to remove irrelevant features and select the best combination of features. After identifying the important features, we applied different variants of data sampling techniques to overcome the class imbalance problem. Finally, we used thirty-three different classifiers to find important patterns that help identify anti-patterns. All of these techniques are compared using accuracy and the area under the receiver operating characteristic (ROC) curve (AUC).
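The sampling and evaluation steps described above can be sketched as follows. This is a minimal illustration only, not the authors' implementation: it assumes up-sampling means random duplication of minority-class rows, and it computes AUC from its pairwise-ranking definition. The function names `upsample_minority` and `auc` are hypothetical.

```python
import random


def upsample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes match the largest.

    Assumption: plain random duplication; the abstract does not specify
    how its up-sampling variant is implemented.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Labels are 1 (anti-pattern present) or 0 (absent); ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: four files without the anti-pattern, one with it.
balanced_rows, balanced_labels = upsample_minority(
    [[1.0], [2.0], [3.0], [4.0], [9.0]], [0, 0, 0, 0, 1]
)
```

In a cross-validated setup, up-sampling would normally be applied to the training folds only, so that duplicated rows never appear in both the training and test data.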
The experimental results of the web service anti-pattern prediction models, validated on 226 WSDL files, show that the least squares support vector machine (LSSVM) with an RBF kernel performs best among the 33 classifiers compared, with the lowest Friedman mean rank of 1.18. In the comparison of feature subset selection techniques, the models developed using significant features achieve a mean accuracy of 88.40% and a mean AUC of 0.88, both higher than those of the other techniques. Among the sampling methods, up-sampling (UPSAM) achieved the highest mean accuracy and mean AUC, at 86.14% and 0.87, respectively. The results indicate that the performance of web service anti-pattern prediction models is adversely affected by class imbalance and irrelevant features. After sampling and feature selection strategies were applied, the trained models reached AUC values between 0.805 and 0.99, an improvement over models built without these techniques. The results imply that UPSAM achieves better performance than the other sampling methods, and that models developed using significant features outperform those built with the other feature selection techniques.
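The Friedman mean rank used to compare classifiers can be illustrated with a short sketch. It assumes the usual convention: within each dataset the classifier with the highest score gets rank 1, ties share the average rank, and ranks are then averaged across datasets, so a lower mean rank is better. The function name `mean_ranks` and the toy scores are hypothetical and are not taken from the paper.

```python
def mean_ranks(scores):
    """Return the mean rank of each classifier across datasets.

    scores maps a classifier name to a list of scores, one per dataset,
    where a higher score is better. Rank 1 is best; ties get the average rank.
    """
    names = list(scores)
    n_datasets = len(scores[names[0]])
    totals = {name: 0.0 for name in names}
    for i in range(n_datasets):
        # Sort classifiers on this dataset from best to worst score.
        column = sorted(((scores[name][i], name) for name in names),
                        key=lambda item: -item[0])
        pos = 0
        while pos < len(column):
            end = pos
            # Extend the run over all classifiers tied with column[pos].
            while end + 1 < len(column) and column[end + 1][0] == column[pos][0]:
                end += 1
            average_rank = (pos + end) / 2 + 1
            for k in range(pos, end + 1):
                totals[column[k][1]] += average_rank
            pos = end + 1
    return {name: totals[name] / n_datasets for name in names}


# Toy AUC values for three hypothetical classifiers on three datasets.
ranks = mean_ranks({
    "A": [0.9, 0.8, 0.7],
    "B": [0.8, 0.8, 0.6],
    "C": [0.7, 0.6, 0.9],
})
best = min(ranks, key=ranks.get)
```

Here `best` is the classifier with the lowest mean rank, which is the sense in which the abstract reports LSSVM with an RBF kernel as the top performer.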

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d079/11822131/7bc034d3399c/41598_2025_86454_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d079/11822131/e0bb37c9a934/41598_2025_86454_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d079/11822131/27b7fc824ef7/41598_2025_86454_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d079/11822131/5daf40950d5a/41598_2025_86454_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d079/11822131/cb58b2e7c4b7/41598_2025_86454_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d079/11822131/7d08f121f2d2/41598_2025_86454_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d079/11822131/d586df4fa127/41598_2025_86454_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d079/11822131/24eec6250119/41598_2025_86454_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d079/11822131/08326c71efb9/41598_2025_86454_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d079/11822131/922dd6b7fd52/41598_2025_86454_Fig10_HTML.jpg

Similar articles

1
Application of information theoretic feature selection and machine learning methods for the development of genetic risk prediction models.
Sci Rep. 2021 Dec 2;11(1):23335. doi: 10.1038/s41598-021-00854-x.
2
Machine learning algorithms for outcome prediction in (chemo)radiotherapy: An empirical comparison of classifiers.
Med Phys. 2018 Jul;45(7):3449-3459. doi: 10.1002/mp.12967. Epub 2018 Jun 13.
3
Combining handcrafted features with latent variables in machine learning for prediction of radiation-induced lung damage.
Med Phys. 2019 May;46(5):2497-2511. doi: 10.1002/mp.13497. Epub 2019 Apr 8.
4
Improved support vector machine classification for imbalanced medical datasets by novel hybrid sampling combining modified mega-trend-diffusion and bagging extreme learning machine model.
Math Biosci Eng. 2023 Sep 15;20(10):17672-17701. doi: 10.3934/mbe.2023786.
5
Radiomics analysis for the differentiation of autoimmune pancreatitis and pancreatic ductal adenocarcinoma in 18F-FDG PET/CT.
Med Phys. 2019 Oct;46(10):4520-4530. doi: 10.1002/mp.13733. Epub 2019 Aug 13.
6
Prediction of atherosclerosis using machine learning based on operations research.
Math Biosci Eng. 2022 Mar 14;19(5):4892-4910. doi: 10.3934/mbe.2022229.
7
A Novel Rank Aggregation-Based Hybrid Multifilter Wrapper Feature Selection Method in Software Defect Prediction.
Comput Intell Neurosci. 2021 Nov 24;2021:5069016. doi: 10.1155/2021/5069016. eCollection 2021.
8
Next-Generation Radiogenomics Sequencing for Prediction of EGFR and KRAS Mutation Status in NSCLC Patients Using Multimodal Imaging and Machine Learning Algorithms.
Mol Imaging Biol. 2020 Aug;22(4):1132-1148. doi: 10.1007/s11307-020-01487-8.
9
Analysis of sampling techniques for imbalanced data: An n = 648 ADNI study.
Neuroimage. 2014 Feb 15;87:220-41. doi: 10.1016/j.neuroimage.2013.10.005. Epub 2013 Oct 29.
