Principles for data analysis workflows.

Affiliations

Berkeley Institute for Data Science, University of California Berkeley, Berkeley, California, United States of America.

Statistical & Data Sciences Program, Smith College, Northampton, Massachusetts, United States of America.

Publication information

PLoS Comput Biol. 2021 Mar 18;17(3):e1008770. doi: 10.1371/journal.pcbi.1008770. eCollection 2021 Mar.

Abstract

A systematic and reproducible "workflow" (the process that moves a scientific investigation from raw data to coherent research question to insightful contribution) should be a fundamental part of academic data-intensive research practice. In this paper, we elaborate basic principles of a reproducible data analysis workflow by defining 3 phases: the Explore, Refine, and Produce Phases. Each phase is roughly centered around the audience to whom research decisions, methodologies, and results are being immediately communicated. Importantly, each phase can also give rise to a number of research products beyond traditional academic publications. Where relevant, we draw analogies between design principles and established practice in software development. The guidance provided here is not intended to be a strict rulebook; rather, the suggestions for practices and tools to advance reproducible, sound data-intensive analysis may furnish support for both students new to research and current researchers who are new to data-intensive work.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90ed/7971542/af711fa2f367/pcbi.1008770.g001.jpg
