Quick remedy commits and their impact on mining software repositories.

Authors

Wen Fengcai, Nagy Csaba, Lanza Michele, Bavota Gabriele

Affiliation

Software Institute, USI Università della Svizzera italiana, Lugano, Switzerland.

Publication information

Empir Softw Eng. 2022;27(1):14. doi: 10.1007/s10664-021-10051-z. Epub 2021 Oct 28.

Abstract

Most changes during software maintenance and evolution are not atomic changes, but rather the result of several related changes affecting different parts of the code. Developers may omit needed changes, leaving a task partially unfinished, introducing technical debt, or injecting bugs. We present a study investigating "quick remedy commits" performed by developers to implement changes omitted in previous commits. With quick remedy commits we refer to commits that (i) quickly follow a commit performed by the same developer, and (ii) aim at remedying issues introduced as the result of code changes omitted in the previous commit (e.g., fix references to code components that have been broken as a consequence of a rename refactoring) or simply improve the previously committed change (e.g., improve the name of a newly introduced variable). Through a manual analysis of 500 quick remedy commits, we define a taxonomy categorizing the types of changes that developers tend to omit. The taxonomy can (i) guide the development of tools aimed at detecting omitted changes and (ii) help researchers in identifying corner cases that must be properly handled. For example, one of the categories in our taxonomy groups the completely reverted commits, meaning changes that are undone in a subsequent commit. We show that not accounting for such commits when mining software repositories can undermine one's findings. In particular, our results show that considering completely reverted commits when mining software repositories accounts, on average, for 0.07 and 0.27 noisy data points when dealing with two typical MSR data collection tasks (i.e., bug-fixing commits identification and refactoring operations mining, respectively).
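
Criterion (i) of the definition above can be illustrated with a minimal sketch. It is not the authors' tooling: the `Commit` record, the function name `candidate_quick_remedies`, and the 30-minute window are hypothetical choices made for illustration, since the abstract does not say how "quickly" is measured. The sketch also assumes that "follow" means adjacent in the time-ordered history. It only shortlists candidate pairs; criterion (ii) still requires inspecting the changes themselves, as in the manual analysis the abstract describes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Commit:
    """Hypothetical minimal commit record (not a real library type)."""
    sha: str
    author: str
    when: datetime


def candidate_quick_remedies(commits, max_gap=timedelta(minutes=30)):
    """Return (previous, follow_up) pairs that satisfy criterion (i) only.

    A pair qualifies when two commits adjacent in the time-ordered history
    share an author and are at most `max_gap` apart. The default window is
    an assumed value, not one taken from the study.
    """
    ordered = sorted(commits, key=lambda c: c.when)
    pairs = []
    for prev, curr in zip(ordered, ordered[1:]):
        same_author = curr.author == prev.author
        close_in_time = curr.when - prev.when <= max_gap
        if same_author and close_in_time:
            pairs.append((prev, curr))
    return pairs
```

The returned pairs are a shortlist for review, not a classification: a follow-up commit by the same developer may be unrelated work, so each pair would still need checking against criterion (ii).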

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4dde/8553712/5ace234c1899/10664_2021_10051_Fig1_HTML.jpg
