Translational Disease Systems Biology, Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3B, DK-2200, Copenhagen N, Denmark.
Department of Biomedicine, Aarhus University, Høegh-Guldbergsgade 10, DK-8000, Aarhus C, Denmark.
BMC Genom Data. 2023 May 27;24(1):30. doi: 10.1186/s12863-023-01132-7.
Allele counts of sequence variants obtained by whole genome sequencing (WGS) often play a central role in interpreting the results of genetic and genomic research. However, such variant counts are not readily available for individuals in the Danish population. Here, we present a dataset with allele counts for sequence variants (single nucleotide variants (SNVs) and indels) identified from WGS of 8,671 individuals (5,418 females) from the Danish population. The data resource is based on WGS data from three independent research projects aimed at assessing genetic risk factors for cardiovascular, psychiatric, and headache disorders. To enable the sharing of information on sequence variation in Danish individuals, we created summary statistics on allele counts from anonymized data and made them available through the European Genome-phenome Archive (EGA, https://identifiers.org/ega. EGAD00001009756) and in a dedicated browser, DanMAC5 (available at www.danmac5.dk). The summary-level data and the DanMAC5 browser provide insight into the allelic spectrum of sequence variants segregating in the Danish population, which is important in variant interpretation.
Three WGS datasets with an average coverage of 30x were processed independently using the same quality control pipeline. Subsequently, we summarized, filtered, and merged allele counts to create a high-quality summary level dataset of sequence variants.
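The summarize, filter, and merge step can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The variant key (chromosome, position, reference allele, alternate allele), the `Counts` type, the `merge_allele_counts` function, and the `min_ac` threshold are all hypothetical names and choices made for illustration. The sketch also assumes that each cohort has already passed quality control and is reduced to per-variant alternate allele counts (AC) and total called alleles (AN).

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele)
VariantKey = Tuple[str, int, str, str]


class Counts(NamedTuple):
    ac: int  # alternate allele count in a cohort
    an: int  # total number of called alleles in a cohort


def merge_allele_counts(
    cohorts: Iterable[Dict[VariantKey, Counts]],
    min_ac: int = 1,  # assumed filter threshold, not taken from the source
) -> Dict[VariantKey, Counts]:
    """Sum AC and AN per variant across cohorts, then drop variants below min_ac."""
    totals: Dict[VariantKey, list] = defaultdict(lambda: [0, 0])
    for cohort in cohorts:
        for key, counts in cohort.items():
            totals[key][0] += counts.ac
            totals[key][1] += counts.an
    # Simplification: a cohort that does not list a variant adds nothing to AN,
    # although in real data its reference-only calls would also count.
    return {
        key: Counts(ac, an)
        for key, (ac, an) in totals.items()
        if ac >= min_ac and an > 0
    }


# Toy data for three cohorts (invented numbers, for illustration only)
cohort_a = {("1", 1000, "A", "G"): Counts(3, 200), ("2", 500, "C", "T"): Counts(1, 200)}
cohort_b = {("1", 1000, "A", "G"): Counts(4, 300)}
cohort_c = {("1", 1000, "A", "G"): Counts(1, 100), ("2", 500, "C", "T"): Counts(1, 100)}

merged = merge_allele_counts([cohort_a, cohort_b, cohort_c], min_ac=5)
for (chrom, pos, ref, alt), c in merged.items():
    print(f"{chrom}:{pos} {ref}>{alt} AC={c.ac} AN={c.an} AF={c.ac / c.an:.4f}")
```

With the toy data, only the variant at 1:1000 reaches the assumed threshold of five alternate alleles after merging. Sharing only summed counts of this kind, rather than individual genotypes, is what allows the resource to be released from anonymized data.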