Variant Bio Inc., Seattle, WA, USA.
Department of Biochemistry, University of Otago, Dunedin, New Zealand.
BMC Genomics. 2021 Nov 1;22(1):666. doi: 10.1186/s12864-021-07949-9.
Historically, geneticists have relied on genotyping arrays and imputation to study human genetic variation. However, the underrepresentation of diverse populations has resulted in arrays that poorly capture global genetic variation and in a lack of suitable reference panels. This has contributed to deepening global health disparities. Whole genome sequencing (WGS) better captures genetic variation but remains prohibitively expensive. Thus, we explored WGS at "mid-pass" 1-7x coverage.
Here, we developed and benchmarked methods for mid-pass sequencing. When applied to a population without an existing genomic reference panel, 4x mid-pass performed consistently well across ethnicities, with high recall (98%) and precision (97.5%).
Compared to array data imputed into 1000 Genomes, mid-pass performed better across all metrics and identified novel population-specific variants with potential disease relevance. We hope our work will reduce financial barriers for geneticists from underrepresented populations to characterize their genomes prior to biomedical genetic applications.
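The recall and precision figures reported above can be illustrated with a minimal sketch. It assumes variant calls are compared as exact site-and-allele matches against a high-coverage truth set; the abstract does not describe the benchmarking pipeline, so the names `benchmark_calls` and `CallMetrics` and the toy variants are hypothetical, and genotype concordance is not considered.

```python
from typing import NamedTuple, Set, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
# This exact-match representation is an assumption made for illustration.
Variant = Tuple[str, int, str, str]


class CallMetrics(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    recall: float
    precision: float


def benchmark_calls(truth: Set[Variant], calls: Set[Variant]) -> CallMetrics:
    """Compare a call set against a truth set by exact variant match."""
    tp = len(truth & calls)   # called and present in the truth set
    fp = len(calls - truth)   # called but absent from the truth set
    fn = len(truth - calls)   # present in the truth set but not called
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return CallMetrics(tp, fp, fn, recall, precision)


# Toy data (hypothetical): five truth variants, four mid-pass calls.
truth_set = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 1500, "G", "A"),
    ("chr2", 3000, "T", "C"),
    ("chr3", 500, "A", "T"),
}
mid_pass_calls = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 1500, "G", "A"),
    ("chr4", 700, "C", "G"),  # not in the truth set: a false positive
}

metrics = benchmark_calls(truth_set, mid_pass_calls)
print(f"recall={metrics.recall:.2f} precision={metrics.precision:.2f}")
```

In this toy case, three of the five truth variants are recovered (recall 0.60) and three of the four calls are correct (precision 0.75). Dedicated benchmarking tools would additionally normalize variant representation and compare genotypes, which this sketch omits.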