Department of Human Genetics, University of Utah, Salt Lake City, UT, USA.
Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA.
Nat Commun. 2021 Apr 12;12(1):2151. doi: 10.1038/s41467-021-22381-z.
The rapid increase in the amount of genomic data gives researchers an opportunity to integrate diverse datasets and annotations when addressing a wide range of biological questions. However, genomic datasets are deposited on different platforms and stored in numerous formats from multiple genome builds, which complicates the task of collecting, annotating, transforming, and integrating data as needed. Here, we developed Go Get Data (GGD) as a fast, reproducible approach to installing standardized data recipes. GGD is available on GitHub (https://gogetdata.github.io/), is extendable to other data types, and can streamline the complexities typically associated with data integration, saving researchers time and improving research reproducibility.