Successes and Struggles with Computational Reproducibility: Lessons from the Fragile Families Challenge.
Author information
Liu David M, Salganik Matthew J
Affiliation
Princeton University, Princeton, NJ, USA.
Publication information
Socius. 2019 Jan-Dec;5. doi: 10.1177/2378023119849803. Epub 2019 Sep 10.
Reproducibility is fundamental to science, and an important component of reproducibility is computational reproducibility: the ability of a researcher to recreate the results of a published study using the original author's raw data and code. Although most people agree that computational reproducibility is important, it is still difficult to achieve in practice. In this article, the authors describe their approach to enabling computational reproducibility for the 12 articles in this special issue of Socius about the Fragile Families Challenge. The approach draws on two tools commonly used by professional software engineers but not widely used by academic researchers: software containers (e.g., Docker) and cloud computing (e.g., Amazon Web Services). These tools made it possible to standardize the computing environment around each submission, which will ease computational reproducibility both today and in the future. Drawing on their successes and struggles, the authors conclude with recommendations to researchers and journals.