Rath Michael, Mäder Patrick
DLR Institute of Data Science, Jena, Germany.
Technische Universität Ilmenau, Ilmenau, Germany.
Data Brief. 2019 May 24;25:104005. doi: 10.1016/j.dib.2019.104005. eCollection 2019 Aug.
This paper provides a systematically retrieved dataset consisting of 33 open-source software projects containing a large number of typed artifacts and trace links between them. The artifacts stem from the projects' issue tracking systems and source version control systems, which enables their joint analysis. Enriched with additional metadata, such as timestamps, release versions, component information, and developer comments, the dataset is well suited for empirical research, e.g., in requirements and software traceability analysis, software evolution, bug and feature localization, and stakeholder collaboration. It can stimulate new research directions, facilitate the replication of existing studies, and act as a benchmark for the comparison of competing approaches. The data is hosted on Harvard Dataverse under DOI 10.7910/DVN/PDDZ4Q and is accessible via https://bit.ly/2wukCHc.