Understanding and improving the quality and reproducibility of Jupyter notebooks.

Authors

Pimentel João Felipe, Murta Leonardo, Braganholo Vanessa, Freire Juliana

Affiliations

Instituto de Computação, Universidade Federal Fluminense, Niterói, RJ Brazil.

Department of Computer Science and Engineering, New York University, New York, NY USA.

Publication information

Empir Softw Eng. 2021;26(4):65. doi: 10.1007/s10664-021-09961-9. Epub 2021 May 8.

DOI:10.1007/s10664-021-09961-9

PMID:33994841

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8106381/

Abstract

Jupyter Notebooks have been widely adopted by many different communities, both in science and industry. They support the creation of literate programming documents that combine code, text, and execution results with visualizations and other rich media. The self-documenting aspects and the ability to reproduce results have been touted as significant benefits of notebooks. At the same time, there has been growing criticism that the way in which notebooks are being used leads to unexpected behavior, encourages poor coding practices, and makes it hard to reproduce their results. To better understand good and bad practices used in the development of real notebooks, in prior work we studied 1.4 million notebooks from GitHub. We presented a detailed analysis of the characteristics that impact reproducibility, proposed best practices that can improve reproducibility, and discussed open challenges that require further research and development. In this paper, we extend the analysis in four different ways to validate the hypothesis uncovered in our original study. First, we separated a group of popular notebooks to check whether notebooks that get more attention exhibit higher quality and reproducibility. Second, we sampled notebooks from the full dataset for an in-depth qualitative analysis of what constitutes the dataset and which features the notebooks have. Third, we conducted a more detailed analysis by isolating library dependencies and testing different execution orders, and we report how these factors impact the reproducibility rates. Finally, we mined association rules from the notebooks and discuss the patterns we discovered, which provide additional insights into notebook reproducibility. Based on our findings and the best practices we proposed, we designed Julynter, a JupyterLab extension that identifies potential issues in notebooks and suggests modifications that improve their reproducibility. We evaluate Julynter with a remote user experiment with the goal of assessing Julynter recommendations and usability.
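
The abstract names execution order as one factor that affects reproducibility but does not describe how the authors detect it. As a rough illustration only, and not the authors' method or Julynter's implementation, the sketch below inspects the `execution_count` field that the notebook file format stores for each code cell and flags two symptoms: cells executed out of top-to-bottom order, and gaps in the counter sequence. The function names `execution_counts` and `order_issues` and the sample notebook are hypothetical.

```python
def execution_counts(notebook: dict) -> list[int]:
    """Return the execution counts of executed code cells, in document order."""
    return [
        cell["execution_count"]
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
        and cell.get("execution_count") is not None
    ]


def order_issues(notebook: dict) -> dict:
    """Flag simple execution-order symptoms (illustrative heuristic only)."""
    counts = execution_counts(notebook)
    # Out of order: a later cell carries a counter that is not larger
    # than the counter of the executed cell before it.
    out_of_order = any(b <= a for a, b in zip(counts, counts[1:]))
    # Gaps: the counters are not exactly 1..n, which suggests cells were
    # re-run, deleted, or executed in a session that is no longer visible.
    has_gaps = bool(counts) and sorted(counts) != list(range(1, len(counts) + 1))
    return {"out_of_order": out_of_order, "has_gaps": has_gaps}


# Hand-written sample in the shape of a notebook file (illustrative data only).
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Analysis"]},
        {"cell_type": "code", "execution_count": 2, "source": ["df.head()"]},
        {"cell_type": "code", "execution_count": 1, "source": ["import pandas"]},
        {"cell_type": "code", "execution_count": 4, "source": ["df.plot()"]},
    ]
}

print(order_issues(example))
```

A real `.ipynb` file is JSON, so it can be read with `json.load` and passed to `order_issues` directly. A notebook that passes both checks is not guaranteed to be reproducible; the checks only surface two visible warning signs.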
