Rani Pooja, Petrulio Fernando, Bacchelli Alberto
Department of Informatics, University of Zurich, Zurich, Switzerland.
Empir Softw Eng. 2024;29(5):115. doi: 10.1007/s10664-024-10511-2. Epub 2024 Jul 24.
Researchers testing hypotheses related to factors leading to low-quality software often rely on historical data, specifically on details regarding when defects were introduced into a codebase of interest. The prevailing techniques to determine the introduction of defects revolve around variants of the SZZ algorithm. This algorithm leverages information on the lines modified during a bug-fixing commit and finds when these lines were last modified, thereby identifying bug-introducing commits.
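The core step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes that blame information (which commit last modified each line before the fix) is already available as an in-memory mapping, and the names `ChangedLine`, `szz_candidates`, and the toy commit identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangedLine:
    """A line modified or deleted by a bug-fixing commit (hypothetical type)."""
    path: str
    line_no: int  # line number in the parent of the bug-fixing commit


def szz_candidates(fix_changed_lines, blame):
    """Return the commits that last modified the lines changed by a fix.

    `blame` maps (path, line_no) to the id of the commit that last touched
    that line before the fix; it stands in for real blame output.
    """
    candidates = set()
    for line in fix_changed_lines:
        commit = blame.get((line.path, line.line_no))
        if commit is not None:
            candidates.add(commit)
    return candidates


# Toy data, invented for illustration only.
blame = {
    ("src/parser.c", 10): "a1",
    ("src/parser.c", 11): "b2",
    ("docs/README", 3): "c3",
}
fix = [
    ChangedLine("src/parser.c", 10),
    ChangedLine("src/parser.c", 11),
    ChangedLine("docs/README", 3),
]
print(sorted(szz_candidates(fix, blame)))
```

In this toy case every commit that last touched a fixed line is flagged, including the one that only changed documentation, which hints at why unrelated modifications hurt accuracy.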
Despite several improvements and variants, SZZ struggles with accuracy, especially for commits that contain unrelated modifications or that touch files not involved in the introduction of the bug (aka tangled commits and ghost commits).
Our research investigates whether and how incorporating content retrieved from bug discussions can address these issues by identifying the related and external files and thus improve the efficacy of the SZZ algorithm.
To conduct our investigation, we take advantage of the links manually inserted by Mozilla developers in bug reports to signal which commits introduced bugs. From these links, we prepare a dataset comprising 12,472 bug reports. We first manually inspect a sample of 369 bug reports related to these bug-fixing or bug-introducing commits and investigate whether the files mentioned in these reports could be useful for SZZ. After finding evidence that the mentioned files are relevant, we augment SZZ with this information, using different strategies, and evaluate the resulting approach against multiple SZZ variations.
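One way to picture the augmentation is as a filter over SZZ candidates, followed by a precision and recall comparison against the developer-provided links. The sketch below is an assumption made for illustration: the abstract does not specify the strategies used, so the filtering rule (keep candidates only from files mentioned in the bug discussion, falling back to all files when none match) and the names `filter_by_mentioned_files` and `precision_recall` are hypothetical, and the data is a toy example.

```python
def filter_by_mentioned_files(candidates_by_file, mentioned_files):
    """Keep candidate commits only from files mentioned in the bug discussion.

    Falls back to all files when no mentioned file was touched by the fix
    (an assumed strategy, not necessarily one used in the paper).
    """
    kept = {
        path: commits
        for path, commits in candidates_by_file.items()
        if path in mentioned_files
    }
    source = kept if kept else candidates_by_file
    result = set()
    for commits in source.values():
        result |= commits
    return result


def precision_recall(predicted, actual):
    """Compute precision and recall of predicted bug-introducing commits."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Toy data, invented for illustration only.
candidates_by_file = {
    "src/parser.c": {"a1", "b2"},
    "docs/README": {"c3"},
}
mentioned = {"src/parser.c"}  # files named in the bug discussion
actual = {"a1"}               # commit linked by developers as bug-introducing

baseline = filter_by_mentioned_files(candidates_by_file, set())
filtered = filter_by_mentioned_files(candidates_by_file, mentioned)
print(precision_recall(baseline, actual))
print(precision_recall(filtered, actual))
```

With this toy input the filter discards the candidate from the unmentioned file, so fewer false positives remain while the true bug-introducing commit is still found.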
We define a taxonomy outlining the rationale behind developers' references to diverse files in their discussions. We observe that bug discussions often mention files relevant to enhancing the SZZ algorithm's efficacy. Then, we verify that integrating these file references augments the precision of SZZ in pinpointing bug-introducing commits. Yet, it does not markedly influence recall. These results deepen our comprehension of the usefulness of bug discussions for SZZ. Future work can leverage our dataset and explore other techniques to further address the problem of tangled commits and ghost commits. Data & material: https://zenodo.org/records/11484723.