Static test flakiness prediction: How Far Can We Go?

Authors

Pontillo Valeria, Palomba Fabio, Ferrucci Filomena

Affiliation

Software Engineering (SeSa) Lab - Department of Computer Science, University of Salerno, Fisciano, Italy.

Publication

Empir Softw Eng. 2022;27(7):187. doi: 10.1007/s10664-022-10227-1. Epub 2022 Oct 1.

DOI:10.1007/s10664-022-10227-1
PMID:36199835
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9526694/
Abstract

Test flakiness occurs when a test case is non-deterministic, both passing and failing when run against the same code. In recent years, the problem has been closely investigated by researchers and practitioners, who have all shown its relevance in practice. The software engineering research community has been working toward defining approaches for detecting and addressing test flakiness. Despite being quite accurate, most of these approaches rely on expensive dynamic steps, e.g., the computation of code coverage information. Consequently, they might suffer from scalability issues that could preclude their practical use. This limitation has recently been targeted through machine learning solutions that predict the flakiness of tests using various features, such as source code vocabulary or a mixture of static and dynamic metrics computed on individual snapshots of the system. In this paper, we aim to take a step forward and predict test flakiness using only static features. We propose a large-scale experiment on 70 Java projects from the iDFlakies and FlakeFlagger datasets. First, we statistically assess the differences between flaky and non-flaky tests in terms of 25 test and production code metrics and smells, analyzing both their individual and combined effects. Based on the results, we experiment with a machine learning approach that predicts test flakiness solely from static features, comparing it with two state-of-the-art approaches. The key results of the study show that the static approach performs comparably to the baselines. In addition, we found that the characteristics of the production code might impact the performance of flaky test prediction models.
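
The definition of flakiness given in the abstract (a test that both passes and fails against the same code) can be made concrete with a small sketch. The Python below is illustrative only and is not the paper's method: it shows the naive rerun-based labelling that the definition implies, which is exactly the kind of expensive dynamic step that static prediction tries to avoid. The function names `classify_outcomes` and `rerun` and the default of 10 runs are assumptions introduced here.

```python
from typing import Callable, Iterable


def classify_outcomes(outcomes: Iterable[bool]) -> str:
    """Label a test from repeated runs against the same code version.

    True means the run passed, False means it failed.
    """
    seen = set(outcomes)
    if seen == {True, False}:
        return "flaky"    # both behaviours observed on identical code
    if seen == {True}:
        return "pass"
    if seen == {False}:
        return "fail"
    return "unknown"      # no runs were recorded


def rerun(test: Callable[[], bool], runs: int = 10) -> str:
    """Execute a test callable `runs` times and classify the outcomes."""
    return classify_outcomes(test() for _ in range(runs))


if __name__ == "__main__":
    # Hypothetical test whose result depends on hidden state across runs.
    scripted = iter([True, True, False])
    print(rerun(lambda: next(scripted), runs=3))
```

A consistent result across reruns does not prove a test is stable, since a rare failure may simply not have occurred yet. Together with the cost of repeated execution, this is one reason a prediction based on static features is attractive.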


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8362/9526694/75aded6e04aa/10664_2022_10227_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8362/9526694/42badec422b3/10664_2022_10227_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8362/9526694/451520df1328/10664_2022_10227_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8362/9526694/6f08eb106494/10664_2022_10227_Fig4_HTML.jpg
