National Institute for Health Research (NIHR) Bristol Biomedical Research Centre (BRC), Bristol Medical School (Population Health Sciences), University of Bristol, Oakfield House, Bristol, BS8 2BN, UK.
Medical Research Council (MRC) Integrative Epidemiology Unit (IEU), Bristol Medical School (Population Health Sciences), University of Bristol, Oakfield House, Bristol, BS8 2BN, UK.
Genome Biol. 2021 Jan 13;22(1):32. doi: 10.1186/s13059-020-02248-0.
GWAS summary statistics are fundamental to a variety of research applications, yet no common storage format has been widely adopted. Existing tabular formats store information about genetic variants and associations ambiguously or incompletely, lack essential metadata, and are typically not indexed, which yields poor query performance and increases the possibility of errors in data interpretation and post-GWAS analyses. To address these issues, we adapted the variant call format to store GWAS summary statistics (GWAS-VCF) and developed open-source tools to use this format in downstream analyses. We provide open access to over 10,000 complete GWAS summary datasets converted to this format (https://gwas.mrcieu.ac.uk).
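As a rough illustration of how association statistics can sit inside a VCF record, the following minimal Python sketch parses one tab-separated data line whose FORMAT column names per-variant fields. The field keys used here (ES for effect size, SE for standard error, LP for -log10 p-value, AF for allele frequency) are assumptions about the GWAS-VCF layout and are not stated in this abstract. The example record, the `Association` type, and the helper names are hypothetical and use made-up values. It is a sketch of the idea, not the authors' tooling.

```python
from typing import Dict, NamedTuple


class Association(NamedTuple):
    """One variant-trait association read from a VCF-style data line."""
    chrom: str
    pos: int
    variant_id: str
    ref: str
    alt: str
    fields: Dict[str, str]  # FORMAT key -> raw string value


def parse_record(line: str) -> Association:
    """Split a single VCF data line and pair FORMAT keys with sample values.

    Assumes the standard VCF column order:
    CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, <one sample column>.
    """
    cols = line.rstrip("\n").split("\t")
    keys = cols[8].split(":")    # FORMAT column, e.g. "ES:SE:LP:AF"
    values = cols[9].split(":")  # first (trait) sample column
    return Association(
        chrom=cols[0],
        pos=int(cols[1]),
        variant_id=cols[2],
        ref=cols[3],
        alt=cols[4],
        fields=dict(zip(keys, values)),
    )


def p_value(assoc: Association) -> float:
    """Convert the assumed LP field (-log10 p-value) back to a p-value."""
    return 10 ** -float(assoc.fields["LP"])


# Hypothetical record with invented values, for illustration only.
EXAMPLE = "1\t100000\trs_example\tA\tG\t.\tPASS\t.\tES:SE:LP:AF\t0.02:0.01:2.0:0.35"

if __name__ == "__main__":
    assoc = parse_record(EXAMPLE)
    print(assoc.chrom, assoc.pos, assoc.ref, assoc.alt, assoc.fields["ES"], p_value(assoc))
```

Because the reference and alternate alleles occupy fixed, unambiguous columns, the direction of the effect estimate does not depend on an ad hoc column naming convention, which is one of the ambiguities of tabular formats described above.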