Weatherby Gerard, Gryk Michael R
UCONN Health.
University of Illinois, Urbana-Champaign.
Int J Digit Curation. 2020;15(1). doi: 10.2218/ijdc.v15i1.709.
This paper reports on the ongoing activities and curation practices of the National Center for Biomolecular NMR Data Processing and Analysis. Over the past several years, the Center has been developing and extending computational workflow management software for use by a community of biomolecular NMR spectroscopists. Previous work refactored the workflow system to use the PREMIS framework for reporting retrospective provenance, for sharing workflows between scientists, and for supporting data reuse. In this paper, we report on our recent efforts to embed analytics within workflow execution and provenance tracking. Important metrics for each intermediate dataset are included within the corresponding PREMIS intellectual object, which allows both inspection of the operation of individual actors and visualization of the changes throughout a full processing workflow. These metrics can be viewed within the workflow management system or through standalone metadata widgets. Our approach is a hybrid one, supporting automated workflow execution as well as manual intervention and metadata management. In this combination, the workflow system and metadata widgets encourage domain experts to be avid curators of the data they create, fostering both computational reproducibility and scientific data reuse.